VLADIMIR ANASHIN



Abstract. These are lecture notes of a 20-hour course at the International
Summer School Mathematical Methods and Technologies in Computer
Security at Lomonosov Moscow State University, July 9-23, 2006.

Loosely speaking, a T-function is a map of n-bit words into n-bit 
words such that each i-th bit of image depends only on low-order bits 
0, . . . , i of the pre-image. For example, all arithmetic operations (addi- 
tion, multiplication) are T-functions, all bitwise logical operations (xor, 
and, etc.) are T-functions. Any composition of T-functions is a T- 
function as well. Thus T-functions are natural computer word-oriented 
functions. 

It turns out that T-functions are continuous (and often differen- 
tiable!) functions with respect to the so-called 2-adic distance. This 
observation gives a powerful tool to apply 2-adic analysis to construct 
wide classes of T-functions with provable cryptographic properties (long 
period, balance, uniform distribution, high linear complexity, etc.); these 
functions are currently being used in a new generation of fast stream
ciphers. We consider these ciphers as specific automata that could be
associated with dynamical systems on the space of 2-adic integers. From this
viewpoint the lectures could be considered as a course in cryptographic
applications of non-Archimedean dynamics; the latter has recently
attracted significant attention in connection with applications to physics,
biology and cognitive sciences. 

During the course listeners study non-Archimedean machinery and 
its applications to stream cipher design. 



Vladimir Anashin is a Professor and Dean of the Faculty of Information Security at 
the Russian State University for the Humanities. 
1. Introduction

1.1. Goals. Imagine we are a team of cryptographers, and we are going to 
design a software-oriented cipher. That is, we are going to combine basic 
microchip instructions to make a very specific transformation of machine 
words. On the one hand, this transformation must be fast; that is, the 
corresponding computer program must achieve high performance. On the 
other hand, this transformation must be secure: Having both an output 
(that is, encrypted text) and the program, it must be infeasible to obtain 
illegally the corresponding input (i.e., plain text). 

At this point, we should understand the following issues: 

• What are these basic instructions? What are reasonable composi- 
tions of these instructions? 

• Could we give evidence that a certain transformation of this kind
is secure?

Actually, a goal of the course is to clarify these issues. Moreover, in order
to keep our considerations from being too general, and to conclude with some
practical applications, we restrict ourselves to a certain specific kind of
ciphers, the so-called stream ciphers.

1.2. What are stream ciphers? In contemporary digital computers
information is represented in binary form, as a sequence of zeros and ones.
So a plaintext is a sequence $\alpha_0, \alpha_1, \alpha_2, \ldots$, where
$\alpha_i \in \mathbb{B} = \{0,1\}$. Let
$\Gamma = \gamma_0, \gamma_1, \gamma_2, \ldots$ be another sequence of zeros
and ones, which is known both to Alice and Bob, and which is known to no
third party. The sequence $\Gamma$ is called a keystream. To encrypt a
plaintext, Alice just XORs it with the keystream:

$$
\begin{array}{ll}
\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_i, \ldots & \text{(plaintext)} \\
\oplus & \text{(bitwise addition modulo 2)} \\
\gamma_0, \gamma_1, \gamma_2, \ldots, \gamma_i, \ldots & \text{(keystream)} \\
\hline
\zeta_0, \zeta_1, \zeta_2, \ldots, \zeta_i, \ldots & \text{(encrypted text)}
\end{array}
$$

To decrypt, Bob acts in the opposite order:

$$
\begin{array}{ll}
\zeta_0, \zeta_1, \zeta_2, \ldots, \zeta_i, \ldots & \text{(encrypted text)} \\
\oplus & \text{(bitwise addition modulo 2)} \\
\gamma_0, \gamma_1, \gamma_2, \ldots, \gamma_i, \ldots & \text{(keystream)} \\
\hline
\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_i, \ldots & \text{(plaintext)}
\end{array}
$$
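The encryption and decryption steps above can be sketched in a few lines of
Python. This is only an illustration of the protocol, not a secure cipher:
the function name `xor_stream` and the sample bits are assumptions made for
the example, and the keystream is simply drawn with the standard `secrets`
module.

```python
import secrets

def xor_stream(bits, keystream):
    """Bitwise addition modulo 2 of a bit sequence with a keystream."""
    if len(keystream) < len(bits):
        raise ValueError("keystream must be at least as long as the message")
    return [b ^ k for b, k in zip(bits, keystream)]

# Hypothetical sample data: a short plaintext and a random keystream of equal length.
plaintext = [1, 0, 1, 1, 0, 0, 1, 0]
keystream = [secrets.randbits(1) for _ in plaintext]

ciphertext = xor_stream(plaintext, keystream)   # Alice encrypts
recovered = xor_stream(ciphertext, keystream)   # Bob decrypts with the same keystream
```

Since XOR is its own inverse, the same function serves both Alice and Bob.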

Loosely speaking, Shannon's Theorem states that this encryption is secure
provided the keystream $\Gamma$ is picked at random for each plaintext. In
real-life settings we can very rarely fulfil the conditions of Shannon's
Theorem, and usually we use a pseudorandom keystream $\Gamma$ rather than a
random one. That is, usually in real-life ciphers $\Gamma$ is produced by a
certain algorithm, and $\Gamma$ only looks random (e.g., passes certain
statistical tests). A pseudorandom generator, or a pseudorandom number
generator (PRNG), is an algorithm that takes a short random string (which is
called a key, or a seed) and stretches it into a much longer sequence, a
keystream. Actually, within the scope of the course we speak about a stream
cipher meaning a pseudorandom generator which is used for encryption
according to the protocol described above.

Figure 1. Ordinary PRNG

Not every PRNG is suitable for stream encryption. Stream ciphers are
cryptographically secure PRNGs; that is, they must not only produce
statistically good sequences, but they must also withstand a cryptanalyst's
attacks.

2. Preliminaries 

Now we will try to state some of the above mentioned notions more for- 
mally. We start with our main notion, a PRNG. 

2.1. Pseudorandom generators. Basically, a generator we consider during
the course is a finite automaton $\mathfrak{A} = (N, M, f, F, u_0)$ with a
finite state set $N$, state transition (or state update) function
$f\colon N \to N$, finite output alphabet $M$, output function
$F\colon N \to M$ and an initial state (seed) $u_0 \in N$.
Thus, this generator (see Figure 1) produces a sequence

$$\mathcal{S} = \{F(u_0), F(f(u_0)), F(f^{(2)}(u_0)), \ldots, F(f^{(j)}(u_0)), \ldots\}$$

over the set $M$, where

$$f^{(j)}(u_0) = \underbrace{f(\ldots f(}_{j \text{ times}} u_0) \ldots) \quad (j = 1, 2, \ldots); \qquad f^{(0)}(u_0) = u_0.$$

Automata of the form $\mathfrak{A}$ could be used either as pseudorandom
generators per se, or as components of more complicated pseudorandom
generators, the so-called counter-dependent generators (see Figure 2); the
latter produce sequences $\{z_0, z_1, z_2, \ldots\}$ over $M$ according to
the rule

$$z_0 = F_0(u_0),\ u_1 = f_0(u_0);\ \ldots\ z_i = F_i(u_i),\ u_{i+1} = f_i(u_i);\ \ldots \tag{2.0.1}$$

That is, at the $(i+1)$-th step the automaton
$\mathfrak{A}_i = (N, M, f_i, F_i, u_i)$ is applied to the state
$u_i \in N$, producing a new state $u_{i+1} = f_i(u_i) \in N$, and
outputting a symbol $z_i = F_i(u_i) \in M$.
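The two kinds of generators just described can be sketched as Python
generators. The helper names (`generator`, `counter_dependent`) and the toy
update and output functions on 8-bit words are assumptions chosen only for
illustration; they are not taken from the course and carry no security claim.

```python
from itertools import islice

def generator(f, F, u0):
    """Yield F(u0), F(f(u0)), F(f(f(u0))), ... for the automaton (N, M, f, F, u0)."""
    u = u0
    while True:
        yield F(u)
        u = f(u)

def counter_dependent(f_i, F_i, u0):
    """Yield z_i = F_i(u_i) with u_{i+1} = f_i(u_i), following rule (2.0.1)."""
    u, i = u0, 0
    while True:
        yield F_i(i, u)
        u = f_i(i, u)
        i += 1

MASK = 0xFF  # N = M = set of 8-bit words (illustrative choice)

# Ordinary generator: a toy state update and an output that keeps the top 4 bits.
toy = generator(lambda u: (5 * u + 1) & MASK, lambda u: u >> 4, 7)
first_outputs = list(islice(toy, 5))

# Counter-dependent variant: update and output both depend on the step number i.
toy_cd = counter_dependent(lambda i, u: (5 * u + i) & MASK,
                           lambda i, u: (u ^ i) & MASK, 7)
first_cd_outputs = list(islice(toy_cd, 5))
```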

Now to make our considerations more practical, we must impose certain
restrictions on these state update and output functions. As we want our
generators to be implemented in software and to demonstrate good
performance, these functions cannot be arbitrary; they must ultimately be
written as more or less short programs. That is, these functions must be
represented as (not too complicated) compositions of basic instructions of a
contemporary processor. Then, what are these basic instructions?

Figure 2. Counter-dependent PRNG (state update: $u_{i+1} = f_i(u_i)$)

2.2. Basic instructions. A contemporary processor is word-oriented. That
is, it works with words of zeroes and ones of a certain fixed length $n$
(usually $n = 8, 16, 32, 64$). Each binary word $z \in \mathbb{B}^n$ of
length $n$ could be considered as a base-2 expansion of a number
$z \in \{0, 1, \ldots, 2^n - 1\}$ and vice versa:

$$z = \zeta_0 + \zeta_1 \cdot 2 + \zeta_2 \cdot 2^2 + \cdots \longleftrightarrow \zeta_0 \zeta_1 \zeta_2 \ldots \in \mathbb{B}^n$$

We can also identify the set $\{0, 1, \ldots, 2^n - 1\}$ with residues modulo
$2^n$; that is, with the elements of the residue ring
$\mathbb{Z}/2^n\mathbb{Z}$ modulo $2^n$. Actually, arithmetic (numerical)
instructions of a processor are just operations of the residue ring
$\mathbb{Z}/2^n\mathbb{Z}$: An $n$-bit word processor performing a single
instruction of addition (or multiplication) of two $n$-bit numbers just
deletes the more significant digits of a sum (or of a product) of these
numbers, thus merely reducing the result modulo $2^n$. Note that to
calculate a sum of two integers (i.e., without reducing the result modulo
$2^n$) a 'standard' processor uses not a single instruction but a program
that consists of basic instructions!

Another sort of basic instructions of a processor are bitwise logical
operations: XOR, OR, AND, NOT, which are clear from their definitions. It is
worth noting only that the set $\mathbb{B}^n$ with respect to XOR could also
be considered as an $n$-dimensional vector space over the field
$\mathbb{Z}/2\mathbb{Z} = \mathbb{B}$.

The third type of instructions could be called machine ones, since they
depend on a processor. But usually they include such standard instructions
as shifts (left and right) and circular rotations of an $n$-bit word.

Some more formal sample definitions: Let

$$z = \delta_0(z) + \delta_1(z) \cdot 2 + \delta_2(z) \cdot 2^2 + \delta_3(z) \cdot 2^3 + \cdots$$

be a base-2 expansion for $z \in \mathbb{N}_0 = \{0, 1, 2, \ldots\}$. Then,
according to the respective definitions, we have
• $y \operatorname{XOR} z = y \oplus z$ is a bitwise addition modulo 2:
$\delta_j(y \operatorname{XOR} z) \equiv \delta_j(y) + \delta_j(z) \pmod 2$;

• $y \operatorname{AND} z$ is a bitwise multiplication modulo 2:
$\delta_j(y \operatorname{AND} z) \equiv \delta_j(y) \cdot \delta_j(z) \pmod 2$;

• $\lfloor \frac{z}{2} \rfloor$, the integral part of $\frac{z}{2}$, is a shift
towards less significant bits;

• $2 \cdot z$ is a shift towards more significant bits;

• $y \operatorname{AND} z$ is masking of $z$ with the mask $y$;

• $z \pmod{2^k} = z \operatorname{AND} (2^k - 1)$ is a reduction of $z$
modulo $2^k$.
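The sample definitions in the list above can be checked directly on Python
integers. The helper `delta` (the $j$-th binary digit) and the sample values
are assumptions introduced for this sketch.

```python
def delta(z, j):
    """j-th binary digit of a non-negative integer z (digit 0 is the least significant)."""
    return (z >> j) & 1

y, z, k = 0b110101, 0b011110, 4  # arbitrary sample values

for j in range(8):
    # XOR is digit-wise addition modulo 2; AND is digit-wise multiplication modulo 2.
    assert delta(y ^ z, j) == (delta(y, j) + delta(z, j)) % 2
    assert delta(y & z, j) == (delta(y, j) * delta(z, j)) % 2

assert z >> 1 == z // 2                 # shift towards less significant bits
assert z << 1 == 2 * z                  # shift towards more significant bits
assert z % 2**k == z & (2**k - 1)       # reduction modulo 2^k is masking
```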

Let us make the first important observation: 

Basic instructions of a processor, with the exception of rota- 
tions, are well defined on the whole set of positive integers. 

Now we look at the basic instructions from a slightly different point of view.

2.3. T-functions. From a school textbook algorithm of addition of base-2 
expansions of positive integers it immediately follows that each i-th bit of
the sum does not depend on higher order bits of summands, i.e., on j-th 
bits with j > i. The same holds for products, bitwise logical operations, and 
shifts towards higher order bits. This observation gives rise to the following 
definition: 

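Definition 2.1 (T-function). An (m-variate) T-function is any mapping

$$F\colon (\ldots, \alpha_2^{\downarrow}, \alpha_1^{\downarrow}, \alpha_0^{\downarrow}) \mapsto (\ldots, \Phi_2^{\downarrow}(\alpha_0^{\downarrow}, \alpha_1^{\downarrow}, \alpha_2^{\downarrow}), \Phi_1^{\downarrow}(\alpha_0^{\downarrow}, \alpha_1^{\downarrow}), \Phi_0^{\downarrow}(\alpha_0^{\downarrow}))$$

where $\alpha_i^{\downarrow} \in \mathbb{B}^m$ is a Boolean columnar
$m$-dimensional vector; $\mathbb{B} = \{0,1\}$;
$\Phi_i^{\downarrow}\colon (\mathbb{B}^m)^{(i+1)} \to \mathbb{B}^n$ maps
$(i+1)$ Boolean columnar $m$-dimensional vectors
$\alpha_0^{\downarrow}, \ldots, \alpha_i^{\downarrow}$ to the $n$-dimensional
Boolean vector $\Phi_i^{\downarrow}(\alpha_0^{\downarrow}, \ldots, \alpha_i^{\downarrow})$.

For instance, a univariate T-function $F\colon \mathbb{B}^n \to \mathbb{B}^n$
is a mapping of $\mathbb{B}^n$ into itself such that

$$(\ldots, \chi_2, \chi_1, \chi_0) \mapsto (\ldots; \psi_2(\chi_0, \chi_1, \chi_2); \psi_1(\chi_0, \chi_1); \psi_0(\chi_0)),$$

where $\chi_j \in \{0, 1\}$, and each $\psi_j(\chi_0, \ldots, \chi_j)$ is a
Boolean function in Boolean variables $\chi_0, \ldots, \chi_j$.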
Thus, we state that 

Basic instructions of a processor, with the exception of rota- 
tions and shifts towards low order bits, are T-functions. 
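The defining property of a univariate T-function (bit $i$ of the output
depends only on bits $0, \ldots, i$ of the input) can be tested by brute force
on short words. The following sketch assumes 8-bit words; the function name
`is_t_function` and the sample mappings are illustrative choices.

```python
def is_t_function(f, n):
    """Brute-force test on n-bit words: bit i of f(x) may depend only on bits 0..i of x."""
    mask = (1 << n) - 1
    for k in range(1, n + 1):
        low = (1 << k) - 1
        seen = {}
        for x in range(1 << n):
            out = f(x) & mask & low
            # Inputs with equal k low-order bits must give equal k low-order output bits.
            if seen.setdefault(x & low, out) != out:
                return False
    return True

n = 8
mask = (1 << n) - 1
rotl = lambda x: ((x << 1) | (x >> (n - 1))) & mask   # circular rotation by one bit

results = {
    "x + (x*x | 5)": is_t_function(lambda x: x + (x * x | 5), n),
    "x << 1":        is_t_function(lambda x: x << 1, n),
    "x >> 1":        is_t_function(lambda x: x >> 1, n),
    "rotate left":   is_t_function(rotl, n),
}
```

In this run the composition of arithmetic and logical operations and the shift
towards higher order bits pass the test, while the shift towards low order
bits and the rotation fail it, in agreement with the statement above.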

Obviously, a composition of T-functions is a T-function; so while com- 
bining basic instructions into a program, we very often can say that the 
resulting mapping (that is, a program) is a T-function. So, it seems to be a 
good idea to study the above mentioned automata under a restriction that 
both their state update and output functions are T-functions, and try to 
design a stream cipher on their base. 

A few words about terminology: Although the term 'T-function' was suggested
only in 2002 by A. Klimov and A. Shamir, see [15], these mappings are
well-known mathematical objects dating back to the 1960s (however, under
other names: compatible mappings in algebra, determined functions in
automata theory, triangle Boolean mappings in the theory of Boolean
functions, functions that satisfy the Lipschitz condition with constant 1 in
p-adic analysis; see e.g. [19], [25], [4]). Throughout the course we use the
term 'T-function' as the most accepted by the cryptographic community;
however, we will be interested in those properties of T-functions that are
explored in other areas of mathematics. The mentioned p-adic analysis
appears to be the most important one.

2.4. Preparations to p-adic Calculus. We can calculate a sum of two
positive integers represented by their base-2 expansions with a
'school-textbook' algorithm. Note that the summands are represented as
finite strings of 0's and 1's (or, better to say, as infinite strings of 0's
and 1's that contain only a finite number of 1's). Let us look at what
happens if we apply this algorithm to arbitrary infinite strings of 0's and
1's.
Consider an example:

$$
\begin{array}{r}
\ldots 1111 \\
+ \quad \ldots 0001 \\
\hline
\ldots 0000
\end{array}
$$

Obviously, the string $\ldots 000$ is merely 0, and the string $\ldots 001$
is 1. But then we must conclude that $\ldots 111 = -1$; that is, the infinite
string $\ldots 111$ is a base-2 expansion of the negative integer $-1$. With
this in mind, we continue our investigations. Let's try multiplication now:

$$
\begin{array}{r}
\ldots 010101 \\
\times \quad \ldots 000011 \\
\hline
\ldots 010101 \\
+ \quad \ldots 101010 \\
\hline
\ldots 111111
\end{array}
$$

As we know that $\ldots 0011 = 3$, and, as we have agreed,
$\ldots 111 = -1$, we are forced to conclude that
$\ldots 01010101 = -\frac{1}{3}$. This sounds somewhat odd to us, but not so
for a computer! These calculations could be made with an ordinary Windows
built-in calculator, up to the best precision it admits, 64 bits.^1
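The same calculation can be reproduced without a calculator. The sketch
below emulates 64-bit multiplication by reducing modulo $2^{64}$; the names
`BITS`, `MOD`, `x`, and `product` are assumptions of this example.

```python
BITS = 64
MOD = 1 << BITS

x = int("01" * (BITS // 2), 2)     # the 64 low-order bits of ...01010101
product = (3 * x) % MOD            # what a 64-bit multiplication instruction returns

all_ones = MOD - 1                 # the 64 low-order bits of ...111, i.e. of -1
assert product == all_ones
assert (product + 1) % MOD == 0    # ...111 + ...001 = ...000, so ...111 acts as -1
```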

Now denote by $\mathbb{Z}_2$ the set of all infinite binary strings. We could
define addition and multiplication on $\mathbb{Z}_2$ with the said
school-textbook algorithms, thus turning $\mathbb{Z}_2$ into a ring.
Obviously, any T-function is well defined on $\mathbb{Z}_2$. Summing it up,
we conclude that

Basic processor instructions, with the only exception of rotations,
as well as T-functions, are well defined functions on the set
$\mathbb{Z}_2$ of all infinite binary sequences; these functions are
evaluated in $\mathbb{Z}_2$.

^1 Don't forget to switch the calculator into scientific mode and choose bin.

As a matter of fact, these functions turn out to be continuous in some 
well-defined sense. Moreover, very often they are differentiable functions, 
and we can use a special sort of Calculus to study their properties that 
are crucial for cryptography with the techniques similar to that of classical 
Calculus. That is what we are going to do within the course. 

What are we thinking about when saying 'Calculus'? Well, of derivatives,
for instance. And what notion do we use in the definition of a derivative?
Evidently, a notion of limit. But saying that '$a$ is a limit of the sequence
$\{a_i\}_{i=0}^{\infty}$ of numbers as $i$ goes to infinity' we just mean
that these $a_i$ are approximations of $a$, and we can achieve an arbitrarily
good precision of these approximations by taking sufficiently large $i$.

Now we are going to understand what this 'precision' means, or, better to
say, what a computer thinks 'precision' means. A computer cannot work with
arbitrarily long binary words. Actually, its basic instructions work with
words of a certain length, a bitlength. Usual values of bitlengths of
contemporary processors are 8, 16, 32, 64.

Now take some binary string, e.g., the string
$\underbrace{1 \ldots 111}_{64 \text{ times}}$; that is, the number
$2^{64} - 1 = 18446744073709551615$. An 8-bit processor can work only with
8-bit strings, so it can store only the 8 least significant bits of this
string; that is, the number $2^8 - 1 = 255$. A 16-bit processor stores 16
bits, that is, the number $2^{16} - 1 = 65535$; a 32-bit processor stores
this string as $2^{32} - 1 = 4294967295$, etc. It is reasonable to say that
255 is an approximation with 8-bit precision of the number $2^{64} - 1$,
65535 is an approximation with 16-bit precision, etc. Following this logic,
we finally conclude that the sequence

$$255, 65535, 4294967295, \ldots, 2^{2^k} - 1, \ldots$$

tends to $-1 = \ldots 111$ as $k$ goes to infinity, and so does the sequence
$2^n - 1$. That is, $\overset{2}{\lim}_{n \to \infty} (2^n - 1) = -1$, where
$\overset{2}{\lim}$ is something that behaves like an ordinary limit, but
with respect to the '$n$-bit precision'. Further, in case we want this
$\overset{2}{\lim}$ to behave similarly to an ordinary limit, we must
conclude that $\overset{2}{\lim}_{n \to \infty} 2^n = 0$, which is extremely
odd!^2

To discover the underlying reality, we must now understand what notion the
notion of limit is based on. Recalling the classical definition, we see
that the notion of limit is stated in terms of 'how close the two numbers 
are'. That is, the notion of limit is based on the notion of distance! 

The above examples demonstrate that for human beings and for computers,
'distance' means quite different things, or, better to say, is measured in
different ways. For us, human beings, the number $2^{32} = 4294967296$ lies
at a bigger distance from 0 than the number $2^8 = 256$; on the contrary,
$2^{32}$ is closer to 0 than $2^8$ for a computer. What peculiar distance
does a computer use?

^2 Not too odd, however. Intuitively, the sequence $\ldots 0001$,
$\ldots 0010$, $\ldots 0100$, $\ldots$, which is the sequence of base-2
expansions of 1, 2, 4, 8, $\ldots$, tends to $\ldots 0000 = 0$!

3. The notion of p-adic integer 

3.1. The notion of distance. Actually, when we measure a distance between
two points, we associate a non-negative real number to the pair of points.
Obviously, this number is 0 if and only if these points coincide, and the
distance measured from the first point to the second one is equal to the
distance measured in the opposite direction, from the second point towards
the first. The distance obeys the 'law of a triangle'; that is, the distance
from the first point A to the second point B is not greater than the sum of
two distances, from the first point A to an arbitrary third point C, and
from this third point C to the point B. These observations are summarized in
the following definition^3:

Definition 3.1 (Metric). Let $M$ be a non-empty set, and let
$d\colon M \times M \to \mathbb{R}_{\geq 0}$ be a function valuated in
non-negative real numbers. The function $d$ is called a metric (and $M$ is
called a metric space) whenever $d$ obeys the following laws:

(1) For every pair $a, b \in M$, $d(a, b) = 0$ if and only if $a = b$.

(2) For every pair $a, b \in M$, $d(a, b) = d(b, a)$.

(3) For every triple $a, b, c \in M$, $d(a, b) \leq d(a, c) + d(c, b)$.

For example, the set $\mathbb{R}$ of all real numbers is a metric space with
metric $d(a, b) = |a - b|$, where $|\cdot|$ is absolute value. The latter
notion could also be defined for an arbitrary commutative ring $R$.

Definition 3.2 (Norm). A function $\|\cdot\|$ defined on the ring $R$ and
valuated in $\mathbb{R}_{\geq 0}$ is called a norm whenever $\|\cdot\|$
satisfies the following conditions:

(1) For every $a \in R$, $\|a\| = 0$ if and only if $a = 0$.

(2) For every pair $a, b \in R$, $\|a \cdot b\| = \|a\| \cdot \|b\|$.

(3) For every pair $a, b \in R$, $\|a + b\| \leq \|a\| + \|b\|$.

It is easy to verify that assuming $d(a, b) = \|a - b\|$ we define a metric
$d$ on the ring $R$. This metric $d$ is called a metric induced by the norm
$\|\cdot\|$.

Note that once the norm (whence, metric) on the ring $R$ is defined, we
immediately define a notion of convergent sequence over $R$, a notion of
limit, a notion of continuous function defined on $R$ and valuated in $R$, a
notion of derivative of a function, etc. For instance, an element $a \in R$
is a derivative of the function $f\colon R \to R$ at the point $x \in R$ if
and only if for all sufficiently small $h \in R$, $h \neq 0$, (that is, for
$\|h\| < \delta$ for some real $\delta > 0$)

$$f(x + h) = f(x) + a \cdot h + \lambda(h),$$

where $\frac{\|\lambda(h)\|}{\|h\|}$ goes to 0 as $\|h\|$ goes to 0. Thus,
loosely speaking, every new norm leads to a new Calculus.



^3 Mathematicians used to speak of metric rather than of distance, but
distance is also OK.
3.2. Norms on Z. We know that absolute value $|\cdot|$ is a norm on the ring
$\mathbb{Z}$ of all integers. The question arises, is $|\cdot|$ the only
norm on $\mathbb{Z}$? Surprisingly, not!

Let $p$ be a prime number. Using this $p$, we now define a norm
$\|\cdot\|_p$ on $\mathbb{Z}$. Obviously, since $\|-a\| = \|a\|$ for every
$a \in R$ (it is an exercise to deduce this identity from Definition 3.2!),
it suffices to define $\|\cdot\|_p$ on the set $\mathbb{N}_0$ of all
non-negative integers. We assume $\|0\|_p = 0$. Now, if $n > 0$ is a natural
number, it has a unique representation as a product of powers of pairwise
distinct primes. Denote by $\operatorname{ord}_p n$ the exponent of $p$ in
this representation and put $\|n\|_p = p^{-\operatorname{ord}_p n}$. It is
an exercise to verify that the so defined function is a norm.

Indeed, (1) and (2) of Definition 3.2 obviously hold for the so defined
norm. Moreover, (3) holds in a stronger form:

(3') For every pair $a, b \in \mathbb{Z}$,
$\|a + b\|_p \leq \max\{\|a\|_p, \|b\|_p\}$.

From here it obviously follows that the metric $d_p$ defined by the norm
$\|\cdot\|_p$ also satisfies a stronger relation than (3) of Definition 3.1:

(3') For every triple $a, b, c \in \mathbb{Z}$,
$d_p(a, b) \leq \max\{d_p(a, c), d_p(c, b)\}$.

The latter relation is called a strong triangle inequality, and a metric that
satisfies this inequality is called a non-Archimedean metric, or an
ultrametric. Accordingly, a metric space equipped with a non-Archimedean
metric is called a non-Archimedean metric space, or an ultrametric space.
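The $p$-adic norm on integers is easy to compute, and the multiplicativity
condition (2) and the strong triangle inequality (3') can be spot-checked on
a small sample. This is a numerical illustration rather than a proof; the
names `ord_p` and `norm_p`, the prime $p = 3$, and the sample range are
assumptions of the sketch.

```python
from fractions import Fraction
from itertools import product

def ord_p(n, p):
    """Exponent of the prime p in the factorization of a non-zero integer n."""
    if n == 0:
        raise ValueError("ord_p(0) is not defined (the norm of 0 is 0)")
    n, e = abs(n), 0
    while n % p == 0:
        n //= p
        e += 1
    return e

def norm_p(n, p):
    """p-adic norm ||n||_p = p^(-ord_p n), with ||0||_p = 0."""
    return Fraction(0) if n == 0 else Fraction(1, p ** ord_p(n, p))

p = 3
sample = range(-30, 31)
# Multiplicativity (2) and the strong triangle inequality (3') on a small sample.
multiplicative = all(norm_p(a * b, p) == norm_p(a, p) * norm_p(b, p)
                     for a, b in product(sample, repeat=2))
ultrametric = all(norm_p(a + b, p) <= max(norm_p(a, p), norm_p(b, p))
                  for a, b in product(sample, repeat=2))
```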

3.3. p-adic integers. Clearly, for natural $n \in \mathbb{N}$ one can
calculate $\operatorname{ord}_p n$ according to the following rule:
Represent $n$ in its base-$p$ expansion, find the least significant non-zero
digit (let it be the $i$-th digit; enumeration starts with zero); then
$\operatorname{ord}_p n = i$. That is,

$$n = \ldots a_{i+1} a_i \underbrace{00 \ldots 0}_{i \text{ zeros}},\ a_i \neq 0 \implies \|n\|_p = \frac{1}{p^i}.$$

The latter definition could be extended to the whole set $\mathbb{Z}_p$ of
infinite strings of digits $0, 1, \ldots, p - 1$ in an obvious manner. Now it
is not difficult to prove that the set $\mathbb{Z}_p$ is a commutative ring
with respect to addition and multiplication defined by 'school-textbook'
algorithms, and, moreover, the so defined function $\|\cdot\|_p$ is a norm on
this ring!^4 Elements of the ring $\mathbb{Z}_p$ are called p-adic integers.
Actually, we think of the infinite string $\ldots a_i a_{i-1} \ldots a_0$
over the alphabet $\{0, 1, \ldots, p - 1\}$ as of a base-$p$ expansion of a
p-adic integer $a$:

$$a = \ldots a_i a_{i-1} \ldots a_0 = \sum_{i=0}^{\infty} a_i \cdot p^i \tag{3.2.1}$$

Note that for $a, b \in \mathbb{Z}_p$, $d_p(a, b) = \frac{1}{p^i}$ for some
$i = 0, 1, 2, \ldots, \infty$ (the case $i = \infty$ just means that
$d_p(a, b) = 0$, whence $a = b$). Moreover, $d_p(a, b) = \frac{1}{p^i}$ if
and only if

$$a = \ldots a_{i+1} a_i c_{i-1} \ldots c_0; \qquad b = \ldots b_{i+1} b_i c_{i-1} \ldots c_0,$$

and $a_i \neq b_i$. Using an obvious analogy with non-negative rational
integers we write in this case that $a \equiv b \pmod{p^i}$. Thus,
$d_p(a, b) = \frac{1}{p^i}$ where $i$ is the biggest non-negative rational
integer such that $a \equiv b \pmod{p^i}$, and $a \not\equiv b \pmod{p^{i+1}}$.
Throughout the course we denote the $i$-th digit ($i = 0, 1, 2, \ldots$) in
a base-$p$ expansion of a p-adic integer $a \in \mathbb{Z}_p$ via
$\delta_i^p(a)$; that is, $\delta_i^p(a) = a_i$, cf. (3.2.1). We omit the
superscript (especially in case $p = 2$) when it does not lead to
misunderstandings.

^4 Prove this.

The ring $\mathbb{Z}_2$ of infinite binary strings mentioned above
corresponds to the case $p = 2$. Thus, $\mathbb{Z}_2$ is an ultrametric space
with respect to the metric $d_2$ defined by the norm $\|\cdot\|_2$. And,
indeed, with respect to this metric $d_2$ the sequence
$1, 2, 4, \ldots, 2^n, \ldots$ converges to 0 as $n$ goes to infinity;
whence^5, the sequence $1, 3, 7, \ldots, 2^n - 1, \ldots$ indeed converges
to $-1$.
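The two convergence claims can be illustrated numerically by computing 2-adic
distances between ordinary integers. The helper `dist_2` is an assumption of
this sketch; it returns $2^{-i}$, where $2^i$ is the largest power of 2
dividing the difference.

```python
from fractions import Fraction

def dist_2(a, b):
    """2-adic distance between integers a and b: 2^(-i), where 2^i exactly divides a - b."""
    diff = abs(a - b)
    if diff == 0:
        return Fraction(0)
    i = (diff & -diff).bit_length() - 1   # number of trailing zero bits of diff
    return Fraction(1, 2 ** i)

to_zero = [dist_2(2 ** n, 0) for n in range(6)]            # distances from 2^n to 0
to_minus_one = [dist_2(2 ** n - 1, -1) for n in range(6)]  # distances from 2^n - 1 to -1
```

Both lists shrink as $1, \frac{1}{2}, \frac{1}{4}, \ldots$, so the terms get
arbitrarily close to 0 and to $-1$ respectively in the 2-adic sense.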

Actually a processor works with approximations of 2-adic integers with
respect to the 2-adic metric: When one tries to load a number whose base-2
expansion contains more than $n$ significant bits into a register of an
$n$-bit processor, the processor just writes only the $n$ low order bits of
the number into the register, thus reducing the number modulo $2^n$. Thus,
precision of the approximation is defined by the bitlength of the processor.

Since the ring (metric space) $\mathbb{Z}_2$ is of most importance for us, we
proceed with some examples that illustrate our main notions with respect to
$\mathbb{Z}_2$.

Sequences that contain only a finite number of 1's correspond to
non-negative rational integers represented by their base-2 expansions:

$$\ldots 00011 = 3$$

Sequences that contain only a finite number of 0's correspond to negative
rational integers^6:

$$\ldots 111100 = -4$$

Sequences that are (eventually) periodic correspond to rational numbers
that could be represented by irreducible fractions with odd denominators^7:

$$\ldots 1010101 = -\frac{1}{3}$$

Non-periodic sequences correspond to no rational number.

An example of how we measure distances in $\mathbb{Z}_2$:

$$d_2(\ldots 1010101, \ldots 0000101) = \frac{1}{16}$$

That is, $-\frac{1}{3} \equiv 5 \pmod{16}$; $-\frac{1}{3} \not\equiv 5 \pmod{32}$.

^5 To prove this we must first prove a theorem on the limit of a sum of two
convergent sequences. It is a good exercise to re-prove all classical
theorems about limits of compositions of sequences in the general case, for
an arbitrary metric!

^6 Prove this.

^7 Prove this.

3.4. Odd world. Finally we conclude that our computers live in a world other
than the one we human beings live in. This virtual world is very odd. In this
subsection we only mention some facts about this virtual world to make it
more familiar to us. Proofs (and other peculiar facts) could be found in the
above mentioned books and monographs on p-adic analysis.
Our world, the world of real numbers $\mathbb{R}$, is Archimedean. That is,
it satisfies the Archimedean Axiom, which reads:

Given a segment $S$ of the real line of length $s$, and another
(smaller) segment $L$ of length $\ell$, $\ell < s$, there exists a natural
number $n$ such that $n \cdot \ell > s$. (That is, if we append a short
segment to itself a sufficient number of times, we can make the
resulting segment arbitrarily long.)

This axiom does not hold in the p-adic world $\mathbb{Z}_p$: Appending a
segment to itself we could make the resulting segment shorter than the
original one! For instance, let $p = 2$ and let $L$ be some 'segment of
length $\frac{1}{2}$', say, $L = 2$; then doubling the segment ('appending'
it to itself) we obtain a 'segment' $2 \cdot L = 4$, for which we have
$\|4\|_2 = \frac{1}{4}$. The 'doubled segment' is twice as short as the
original!

Of course, the origin of this fact is hidden in the strong triangle
inequality (3') that governs the non-Archimedean world. This inequality
implies other odd-looking facts, e.g.,

• All triangles are isosceles!

• Every point inside a ball is a center of this ball!

• The series $\sum_{i=0}^{\infty} z_i$ of p-adic integers is convergent if
and only if $\overset{p}{\lim}_{i \to \infty} z_i = 0$ (where
$\overset{p}{\lim}$ is a limit with respect to the p-adic norm).

By the way, this implies that, say,
$\ln(-3) = -\sum_{i=1}^{\infty} \frac{4^i}{i}$ is a 2-adic integer!

If you are going to prove these statements (which is a good exercise!) note
that every ball of radius $p^{-k}$ in $\mathbb{Z}_p$ is of the form
$a + p^k \cdot \mathbb{Z}_p = \{a + p^k \cdot z : z \in \mathbb{Z}_p\}$.
By the way, from here it follows that, in case $p = 2$, a boundary of a
(closed) ball of radius $\frac{1}{2^k}$ is itself a ball of radius
$\frac{1}{2^{k+1}}$; e.g., a sphere of radius $\frac{1}{2}$ is a ball of
radius $\frac{1}{4}$! Actually, the whole metric space $\mathbb{Z}_p$ is a
ball of radius 1 (and is a p-adic analog of a real unit interval). For those
who are familiar with functional analysis we mention also that the space
$\mathbb{Z}_p$ is complete with respect to the p-adic distance (metric)
$d_p$, and compact.
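Some of these odd facts can be observed on a finite sample of ordinary
integers equipped with the 2-adic distance. The sketch below is a numerical
illustration only; the helper names (`dist_2`, `is_isosceles`, `ball`) and
the sample range are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations

def dist_2(a, b):
    """2-adic distance between integers: 2^(-i), where 2^i exactly divides a - b."""
    diff = abs(a - b)
    if diff == 0:
        return Fraction(0)
    return Fraction(1, diff & -diff)     # diff & -diff is the largest power of 2 dividing diff

points = range(-16, 17)

def is_isosceles(a, b, c):
    sides = sorted([dist_2(a, b), dist_2(b, c), dist_2(a, c)])
    return sides[1] == sides[2]          # the two longest sides are equal

all_isosceles = all(is_isosceles(a, b, c) for a, b, c in combinations(points, 3))

def ball(center, radius):
    """Closed ball around `center`, restricted to the sample points."""
    return {x for x in points if dist_2(x, center) <= radius}

r = Fraction(1, 4)
# Every point of the ball of radius 1/4 around 0 is a center of the same ball.
same_ball = all(ball(c, r) == ball(0, r) for c in ball(0, r))

# The sphere of radius 1/2 around 0 coincides with the ball of radius 1/4 around 2.
sphere = {x for x in points if dist_2(x, 0) == Fraction(1, 2)}
sphere_is_ball = sphere == ball(2, Fraction(1, 4))
```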



4. Elements of applied 2-adic analysis 

The main goal of this section is to provide some experience in Calculus 
on $\mathbb{Z}_2$. We are not going to do this too formally since there are a number
of excellent books and monographs on p-adic analysis, e.g. [24, 20, 17, 12]. 
We rather focus on those functions and techniques that later in the course 
will be used in our cryptographic applications, stream cipher design. 

4.1. T-functions revisited. We start with 2-adic extensions of what we
called 'basic instructions'. These are primarily arithmetic operations
(addition, subtraction, multiplication) and bitwise logical operations.
These two sets of operations are not mutually independent; some of them
could be expressed via others. The following identities could be proved: For
all $u, v \in \mathbb{Z}_2$

$$
\begin{aligned}
\operatorname{NOT}(u) &= u \operatorname{XOR} (-1); \\
\operatorname{NOT}(u) + u &= -1; \\
u \operatorname{XOR} v &= u + v - 2(u \operatorname{AND} v); \\
u \operatorname{OR} v &= u + v - (u \operatorname{AND} v); \\
u \operatorname{OR} v &= (u \operatorname{XOR} v) + (u \operatorname{AND} v).
\end{aligned}
\tag{4.0.2}
$$
During the course we often write $\oplus$ instead of XOR, also $\odot$, &
or $\wedge$ instead of AND, and $\vee$ instead of OR. These operations (with
the only exception of NOT) are functions of two 2-adic variables. To work
with these functions we need to define a 2-adic metric on the Cartesian
square $\mathbb{Z}_2^2$. Having already defined the metric on
$\mathbb{Z}_2$, we define the metric on the Cartesian product
$\mathbb{Z}_2^n = \underbrace{\mathbb{Z}_2 \times \cdots \times \mathbb{Z}_2}_{n \text{ times}}$
in a standard manner: For $\mathbf{a} = (a^{(1)}, \ldots, a^{(n)})$,
$\mathbf{b} = (b^{(1)}, \ldots, b^{(n)}) \in \mathbb{Z}_2^n$ we put
$\|\mathbf{a}\|_2 = \max\{\|a^{(1)}\|_2, \ldots, \|a^{(n)}\|_2\}$ and,
respectively,
$d_2(\mathbf{a}, \mathbf{b}) = \max\{d_2(a^{(1)}, b^{(1)}), \ldots, d_2(a^{(n)}, b^{(n)})\}$.
We also write $\mathbf{a} \equiv \mathbf{b} \pmod{2^k}$ whenever
$a^{(j)} \equiv b^{(j)} \pmod{2^k}$ for all $j = 1, 2, \ldots, n$.

Now it is the right time to consider T-functions as 2-adic mappings.
Actually (see Definition 2.1) we defined a T-function as a special mapping
that puts into correspondence to every sequence of columnar $m$-dimensional
Boolean vectors a certain sequence of $n$-dimensional columnar Boolean
vectors. Now we can read these sequences not column after column, but row
after row, starting with the top one. Each such row is an infinite sequence
of zeros and ones; that is, a 2-adic integer. Thus,

we can consider a T-function $F$ from Definition 2.1 as a
mapping from $\mathbb{Z}_2^m$ into $\mathbb{Z}_2^n$ such that
$F(\mathbf{a}) \equiv F(\mathbf{b}) \pmod{2^k}$
whenever $\mathbf{a} \equiv \mathbf{b} \pmod{2^k}$.
From this observation a very important theorem immediately follows:

Theorem 4.1. T-functions are mappings from $\mathbb{Z}_2^m$ into
$\mathbb{Z}_2^n$ that satisfy the Lipschitz condition with a constant 1:

$$\|F(\mathbf{a}) - F(\mathbf{b})\|_2 \leq \|\mathbf{a} - \mathbf{b}\|_2$$

and vice versa, all mappings that satisfy this condition are T-functions.

Corollary 4.2. All T-functions are continuous 2-adic functions.^8

These easy claims are a hint that 2-adic analysis could be useful in the
study of T-functions; of course, only of properties that are of 'analytic
nature', which could be properly stated in terms of analysis; that is, in
terms of limits, convergence, derivatives, etc. We have not yet stated what
the properties of T-functions are that are crucial for cryptography. Yet,
when we state these properties a bit later, we will see that fortunately
they are of this 'analytic nature'.

By the way, the above observation reflects a very specific algebraic nature
of T-functions. In general algebra, a congruence of an algebraic system is an
equivalence relation which is preserved by all operations of this system;
that is, if operands are replaced by equivalent elements, the result of the
operation is equivalent to the original one. A function defined on (and
valuated in) the algebraic system is called compatible whenever this
function preserves all congruences of this algebraic system. The only
congruences of the ring $\mathbb{Z}_p$ are congruences modulo $p^k$ for
$k = 1, 2, \ldots$. Thus, T-functions are merely compatible functions on the
ring $\mathbb{Z}_2$, so we start using the term 'compatible' along with (or
instead of) the term 'T-function'.

^8 Any function that satisfies the Lipschitz condition with respect to a
certain metric is continuous with respect to this metric. Prove this!

Actually, 'T-function' just means 'compatible on the ring $\mathbb{Z}_2$',
and many further results hold for functions that are compatible on
$\mathbb{Z}_p$, $p$ prime. A p-adic compatible function is a function that
satisfies the p-adic Lipschitz condition with a constant 1, and vice versa.

4.2. More compatible functions. We already know that arithmetic oper- 
ations (addition, subtraction, and multiplication), as well as bitwise logical 
operations (xor, and, etc.) are T-functions (that is, compatible 2-adic func- 
tions). Obviously, a composition of compatible functions is a compatible 
function. Whence, natural examples of compatible functions are polyno- 
mials with p-adic integer coefficients. That is, all polynomials with integer 
coefficients are T-functions! 

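With some extra effort one could also prove that some other 'natural'
functions are T-functions:

$$
\begin{aligned}
&\text{exponentiation, } \uparrow\colon (u, v) \mapsto u \uparrow v = (1 + 2 \cdot u)^v; \text{ in particular,} \\
&\text{raising to negative powers, } u \uparrow (-r) = (1 + 2 \cdot u)^{-r},\ r \in \mathbb{N}; \text{ and} \\
&\text{division, } /\colon u / v = u \cdot (v \uparrow (-1)) = \frac{u}{1 + 2 \cdot v}.
\end{aligned}
\tag{4.2.1}
$$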

That is, these functions are well defined on $\mathbb{Z}_2$, and satisfy the
2-adic Lipschitz condition with a constant 1. Use of compositions of these
functions with the above mentioned bitwise logical instructions results in
very wild-looking functions, like this one:



Although this function could be easily evaluated on every digital computer
(since this function is continuous in a computer's 2-adic world), we do not
insist on using it (or similar functions) in applications: compositions of
the above mentioned functions may not be of big importance for cryptography
since their program implementations are usually slow, yet they are
of theoretical interest and often arise in studies. The p-adic analogs of the
above functions could be naturally defined (write p instead of 2).

It is also worth noticing here that $(1 + p v)^{-1} = \sum_{i=0}^{\infty} (-1)^i p^i v^i$, and the
series on the right-hand side of this equality converges for every $v \in \mathbb{Z}_p$.
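
For $p = 2$ the series can be truncated on n-bit words, because the i-th term is divisible by $2^i$. The sketch below is an illustration under that observation; the function names `inv_one_plus_2v` and `t_divide` are assumptions, not taken from the notes.

```python
def inv_one_plus_2v(v, n):
    """Inverse of 1 + 2v modulo 2^n via the truncated series sum_i (-2v)^i."""
    mod = 1 << n
    total, term = 0, 1
    for _ in range(n):
        # (-2v)^i is divisible by 2^i, so terms with i >= n vanish modulo 2^n.
        total = (total + term) % mod
        term = (term * (-2 * v)) % mod
    return total

def t_divide(u, v, n):
    """The 'division' u / v = u * (1 + 2v)^(-1) evaluated on n-bit words."""
    return (u * inv_one_plus_2v(v, n)) % (1 << n)

# Sanity check on 8-bit words: (1 + 2*1)^(-1) * 3 should be 1 modulo 256.
print((inv_one_plus_2v(1, 8) * 3) % 256)
```

The loop performs n multiplications, which illustrates why such compositions tend to be slow in software compared with single processor instructions.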

We can describe univariate T-functions in a general way. It turns out
that each function $f\colon \mathbb{N}_0 \to \mathbb{Z}_p$ (or, respectively, $f\colon \mathbb{N}_0 \to \mathbb{Z}$) admits one and
only one representation in the form of a so-called Mahler interpolation series

$$f(x) = \sum_{i=0}^{\infty} a_i \binom{x}{i}, \tag{4.2.2}$$

where $\binom{x}{i} = \frac{x(x-1)\cdots(x-i+1)}{i!}$ for $i = 1, 2, \ldots$, and $\binom{x}{0} = 1$; $a_i \in \mathbb{Z}_p$ (respectively,
$a_i \in \mathbb{Z}$), $i = 0, 1, 2, \ldots$.

If $f$ is uniformly continuous on $\mathbb{N}_0$ with respect to the p-adic distance, it can
be uniquely extended to a uniformly continuous function on $\mathbb{Z}_p$. Hence
the interpolation series for $f$ converges uniformly on $\mathbb{Z}_p$. The following is
true: the series $f(x) = \sum_{i=0}^{\infty} a_i \binom{x}{i}$ ($a_i \in \mathbb{Z}_p$, $i = 0, 1, 2, \ldots$) converges
uniformly on $\mathbb{Z}_p$ iff $\lim^{p}_{i \to \infty} a_i = 0$, where $\lim^{p}$ is a limit with respect to the p-adic
distance; hence a uniformly convergent series defines a uniformly continuous
function on $\mathbb{Z}_p$.

The following theorem holds:

Theorem 4.3. The function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ represented by (4.2.2) is compatible
if and only if

$$a_i \equiv 0 \pmod{p^{\lfloor \log_p i \rfloor}}$$

for all $i = p, p + 1, p + 2, \ldots$. (Here and after, for a real $\alpha$ we denote by $\lfloor \alpha \rfloor$ the
integral part of $\alpha$, i.e., the nearest to $\alpha$ rational integer not exceeding $\alpha$.)
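
Theorem 4.3 can be tried out numerically on the first few Mahler coefficients. The sketch below relies on a standard fact not stated in the notes: $a_i$ equals the i-th finite difference of $f$ at 0, that is, $a_i = \sum_{j=0}^{i} (-1)^{i-j} \binom{i}{j} f(j)$. All function names are illustrative assumptions, and only finitely many coefficients are inspected, so a positive answer is evidence rather than a proof.

```python
from math import comb

def mahler_coefficients(f, count):
    """First `count` Mahler coefficients, computed as finite differences of f at 0."""
    return [sum((-1) ** (i - j) * comb(i, j) * f(j) for j in range(i + 1))
            for i in range(count)]

def floor_log(p, i):
    """Largest e with p^e <= i, for i >= 1."""
    e = 0
    while p ** (e + 1) <= i:
        e += 1
    return e

def satisfies_theorem_4_3(f, p, count):
    """Check a_i = 0 (mod p^floor(log_p i)) for p <= i < count."""
    a = mahler_coefficients(f, count)
    return all(a[i] % p ** floor_log(p, i) == 0 for i in range(p, count))

print(satisfies_theorem_4_3(lambda x: x + (x * x | 5), 2, 48))  # a T-function
print(satisfies_theorem_4_3(lambda x: x >> 1, 2, 48))           # not a T-function
```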

4.3. Derivatives modulo $p^k$. In this subsection we generalize the main
notion of Calculus, a derivative. By the definition, for $a = (a_1, \ldots, a_n)$
and $b = (b_1, \ldots, b_n)$ of $\mathbb{Z}_p^{(n)}$ the congruence $a \equiv b \pmod{p^s}$ means that
$\|a_i - b_i\|_p \le p^{-s}$ (or, the same, that $a_i = b_i + c_i p^s$ for suitable $c_i \in \mathbb{Z}_p$,
$i = 1, 2, \ldots, n$); that is, $\|a - b\|_p \le p^{-s}$.

Definition 4.4 (Derivations modulo $p^k$). A function

$$F = (f_1, \ldots, f_m)\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(m)}$$

is called differentiable modulo $p^k$ at the point $u = (u_1, \ldots, u_n) \in \mathbb{Z}_p^{(n)}$ iff there
exist a positive rational integer $N$ and an $n \times m$ matrix $F'_k(u)$ over $\mathbb{Q}_p$ (which
is called the Jacobi matrix modulo $p^k$ of the function $F$ at the point $u$) such
that for each positive rational integer $K \ge N$ and each $h = (h_1, \ldots, h_n) \in \mathbb{Z}_p^{(n)}$
the inequality $\|h\|_p \le p^{-K}$ implies a congruence

$$F(u + h) \equiv F(u) + h \cdot F'_k(u) \pmod{p^{k+K}}. \tag{4.4.1}$$

In case $m = 1$ the Jacobi matrix modulo $p^k$ is called a differential modulo
$p^k$. In case $m = n$ a determinant of the Jacobi matrix modulo $p^k$ is called the
Jacobian modulo $p^k$. The entries of the Jacobi matrix modulo $p^k$ are called
partial derivatives modulo $p^k$ of the function $F$ at the point $u$. A partial
derivative (respectively, a differential) modulo $p^k$ we sometimes denote as
$\frac{\partial_k f_i(u)}{\partial_k x_j}$ (respectively, as $d_k F(u) = \sum_{i=1}^{n} \frac{\partial_k F(u)}{\partial_k x_i}\, d_k x_i$).

It could be proved that whenever $F$ is compatible, then, if $F$ is differentiable
modulo $p^k$ at some point, the entries of the Jacobi matrix are
necessarily p-adic integers (such functions are said to have integer-valued
derivative).

Since the notion of a function that is differentiable modulo $p^k$ is of high
importance in the theory that follows, we discuss this notion in detail. First of
all, we compare this notion to the classical notion of a differentiable function.
Compared to differentiability, differentiability modulo $p^k$ is a weaker
restriction. As a matter of fact, in the univariate case ($m = n = 1$), definition
4.4 just yields that

$$\frac{F(u + h) - F(u)}{h} \approx F'_k(u).$$

Note that this $\approx$ ('approximately') implies the following:

$\approx$ with arbitrarily high precision $\Rightarrow$ differentiability;

$\approx$ with precision not worse than $p^{-k}$ $\Rightarrow$ differentiability mod $p^k$.

It is obvious that whenever a function is differentiable (and its derivative
is a p-adic integer), it is differentiable modulo $p^k$ for all $k = 1, 2, \ldots$, and in
this case the derivative modulo $p^k$ is just a reduction of the derivative modulo
$p^k$ (note that according to definition 4.4 partial derivatives modulo $p^k$ are
determined up to a summand that is 0 modulo $p^k$).

For functions with integer-valued derivatives modulo $p^k$ the 'rules of
derivation modulo $p^k$' have the same (up to congruence modulo $p^k$ instead
of equality) form as for classical derivations. For instance, if both
functions $G\colon \mathbb{Z}_p^{(s)} \to \mathbb{Z}_p^{(n)}$ and $F\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(m)}$ are differentiable modulo $p^k$ at
the points, respectively, $v = (v_1, \ldots, v_s)$ and $u = G(v)$, and their partial
derivatives modulo $p^k$ at these points are p-adic integers, then the composition
$F \circ G\colon \mathbb{Z}_p^{(s)} \to \mathbb{Z}_p^{(m)}$ of these functions is differentiable modulo $p^k$ at
the point $v$, all its partial derivatives modulo $p^k$ at this point are p-adic
integers, and $(F \circ G)'_k(v) \equiv G'_k(v) F'_k(u) \pmod{p^k}$.

By analogy with the classical case we can give the following

Definition 4.5. A function $F\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(m)}$ is said to be uniformly differentiable
modulo $p^k$ on $\mathbb{Z}_p^{(n)}$ iff there exists $K \in \mathbb{N}$ such that (4.4.1) holds
simultaneously for all $u \in \mathbb{Z}_p^{(n)}$ as soon as $\|h_i\|_p \le p^{-K}$ ($i = 1, 2, \ldots, n$). The
least such $K \in \mathbb{N}$ is denoted via $N_k(F)$.

It could be shown that all partial derivatives modulo $p^k$ of a uniformly
differentiable modulo $p^k$ function $F$ are periodic functions with period $p^{N_k(F)}$
(see [3, Proposition 2.12]). This in particular implies that each partial
derivative modulo $p^k$ could be considered as a function defined on the residue
ring $\mathbb{Z}/p^{N_k(F)}\mathbb{Z}$ modulo $p^{N_k(F)}$. Moreover, if a continuation $\tilde{F}$ of the function
$F = (f_1, \ldots, f_m)\colon \mathbb{N}_0^{(n)} \to \mathbb{N}_0^{(m)}$ to the space $\mathbb{Z}_p^{(n)}$ is uniformly differentiable
modulo $p^k$ on $\mathbb{Z}_p^{(n)}$, then one could continue both the function $F$
and all its (partial) derivatives modulo $p^k$ to the space $\mathbb{Z}_p^{(n)}$ simultaneously.
This implies that we could study, if necessary, (partial) derivatives modulo
$p^k$ of the function $\tilde{F}$ instead of studying those of $F$, and vice versa. For
example, a partial derivative $\frac{\partial_k f_i(u)}{\partial_k x_j}$ modulo $p^k$ vanishes modulo $p^k$ at no
point of $\mathbb{Z}_p^{(n)}$ (that is, $\frac{\partial_k f_i(u)}{\partial_k x_j} \not\equiv 0 \pmod{p^k}$ for all $u \in \mathbb{Z}_p^{(n)}$, or, the same,
$\left\| \frac{\partial_k f_i(u)}{\partial_k x_j} \right\|_p > p^{-k}$ everywhere on $\mathbb{Z}_p^{(n)}$) if and only if $\frac{\partial_k f_i(u)}{\partial_k x_j} \not\equiv 0 \pmod{p^k}$ for
all $u \in \{0, 1, \ldots, p^{N_k(F)} - 1\}^{(n)}$.

To calculate a derivative of, for instance, a T-function that is a composition
of basic instructions, one needs to know derivatives of these basic
instructions (i.e., arithmetic, bitwise logical, etc.). Thus, we briefly introduce
a p-adic analog of a 'table of derivatives' of classical Calculus.

Examples 4.6. Derivatives of bitwise logical operations.

(1) the function $f(x) = x \,\mathrm{AND}\, c$ is uniformly differentiable on $\mathbb{Z}_2$ for
any $c \in \mathbb{Z}$; $f'(x) = 0$ for $c \ge 0$, and $f'(x) = 1$ for $c < 0$, since,
respectively, $f(x + 2^n s) = f(x)$ and $f(x + 2^n s) = f(x) + 2^n s$ for $n \ge l(|c|)$, where
$l(|c|)$ is the bit length of the absolute value of $c$ (mind that for $c \ge 0$
the 2-adic representation of $-c$ starts with $2^{l(c)} - c$ in the less significant
bits followed by ...11: $-1 = \ldots 11$, $-3 = \ldots 11101$, etc.).

(2) the function $f(x) = x \,\mathrm{XOR}\, c$ is uniformly differentiable on $\mathbb{Z}_2$ for any
$c \in \mathbb{Z}$; $f'(x) = 1$ for $c \ge 0$, and $f'(x) = -1$ for $c < 0$. This
immediately follows from (1) since $u \,\mathrm{XOR}\, v = u + v - 2(u \,\mathrm{AND}\, v)$ (see
(4.0.2)); thus $(x \,\mathrm{XOR}\, c)' = x' + c' - 2(x \,\mathrm{AND}\, c)' = 1 + 2 \cdot (0$, for $c \ge 0$;
or $-1$, for $c < 0)$.

(3) in the same manner it could be shown that the functions $(x \bmod 2^n) =
x \,\mathrm{AND}\, (2^n - 1)$ (a reduction modulo $2^n$), $\mathrm{NOT}(x)$ and $(x \,\mathrm{OR}\, c)$ for $c \in \mathbb{Z}$
are uniformly differentiable on $\mathbb{Z}_2$, and $(x \bmod 2^n)' = 0$, $(\mathrm{NOT}\, x)' =
-1$, $(x \,\mathrm{OR}\, c)' = 1$ for $c \ge 0$, $(x \,\mathrm{OR}\, c)' = 0$ for $c < 0$.

(4) the function $f(x, y) = x \,\mathrm{XOR}\, y$ is not uniformly differentiable on $\mathbb{Z}_2^{(2)}$,
yet it is uniformly differentiable modulo 2 on $\mathbb{Z}_2^{(2)}$; from (2) it follows
that its partial derivatives modulo 2 are 1 everywhere on $\mathbb{Z}_2^{(2)}$.
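
Items (1) to (3) can be checked directly with Python integers, whose bitwise operators treat negative numbers as infinite two's-complement strings, which matches the 2-adic representation used above. This is a sketch for illustration; the table `OPS` and the helper names are assumptions, not notation from the notes.

```python
import random

# Operation and its claimed derivative as a function of the constant c.
OPS = {
    "and": (lambda x, c: x & c, lambda c: 0 if c >= 0 else 1),
    "xor": (lambda x, c: x ^ c, lambda c: 1 if c >= 0 else -1),
    "or":  (lambda x, c: x | c, lambda c: 1 if c >= 0 else 0),
}

def derivative_holds(name, c, trials=200, seed=1):
    """Check f(x + h) - f(x) == f'(x) * h for h = 2^n * s with n = bit length of |c|."""
    op, deriv = OPS[name]
    rng = random.Random(seed)
    n = abs(c).bit_length()
    for _ in range(trials):
        x = rng.randint(-2**40, 2**40)
        h = rng.randint(-2**20, 2**20) << n
        if op(x + h, c) - op(x, c) != deriv(c) * h:
            return False
    return True

def not_derivative_holds(trials=200, seed=1):
    """NOT(x) = -x - 1, so the increment is exactly -h for every h."""
    rng = random.Random(seed)
    return all((~(x + h)) - (~x) == -h
               for x, h in ((rng.randint(-2**40, 2**40), rng.randint(-2**20, 2**20))
                            for _ in range(trials)))

print(all(derivative_holds(name, c) for name in OPS for c in range(-40, 41)))
```

Here the increments are checked as exact integer equalities, which is stronger than a congruence modulo a fixed power of 2.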



Here is how it works altogether:

Example. The function $f(x) = x + (x^2 \,\mathrm{OR}\, 5)$ is uniformly differentiable on
$\mathbb{Z}_2$, and $f'(x) = 1 + 2x \cdot (x \,\mathrm{OR}\, 5)' = 1 + 2x$.

The function $F(x, y) = (f(x, y), g(x, y)) = (x \oplus 2(x \wedge y), (y + 3x^3) \oplus x)$
is uniformly differentiable modulo 2 as a bivariate function, and $N_1(F) = 1$;
namely

$$F(x + 2^n t, y + 2^m s) \equiv F(x, y) + (2^n t, 2^m s) \cdot \begin{pmatrix} 1 & x + 1 \\ 0 & 1 \end{pmatrix} \pmod{2^{k+1}}$$

for all $m, n \ge 1$ (here $k = \min\{m, n\}$). The matrix $\begin{pmatrix} 1 & x + 1 \\ 0 & 1 \end{pmatrix} = F'_1(x, y)$
is the Jacobi matrix modulo 2 of $F$; here is how we calculate partial derivatives
modulo 2: for instance,

$$\frac{\partial_1 g(x, y)}{\partial_1 x} = \frac{\partial_1 (u \oplus x)}{\partial_1 u}\Big|_{u = y + 3x^3} \cdot \frac{\partial_1 (y + 3x^3)}{\partial_1 x} + \frac{\partial_1 (u \oplus x)}{\partial_1 x}\Big|_{u = y + 3x^3} = 1 \cdot 9x^2 + 1 \cdot 1 \equiv x + 1 \pmod 2.$$

Note that a partial derivative modulo 2 of the function $2(x \wedge y)$ is always 0 modulo 2 because of the
multiplier 2: the function $x \wedge y$ is not differentiable modulo 2 as a bivariate
function, yet $2(x \wedge y)$ is. So the Jacobian of the function $F$ is $\det F'_1 \equiv 1$
(mod 2).
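
The congruence defining the Jacobi matrix modulo 2 in this example can be spot-checked on random inputs. The sketch below is illustrative only; `jacobi_congruence_holds` is a hypothetical helper, and the product of the row vector $(2^n t, 2^m s)$ with the matrix is expanded by hand in the comment.

```python
import random

def F(x, y):
    """The bivariate function of the example, evaluated over the integers."""
    return (x ^ (2 * (x & y)), (y + 3 * x**3) ^ x)

def jacobi_congruence_holds(trials=2000, seed=2):
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.getrandbits(32), rng.getrandbits(32)
        n, m = rng.randint(1, 10), rng.randint(1, 10)
        t, s = rng.getrandbits(8), rng.getrandbits(8)
        mod = 1 << (min(n, m) + 1)          # modulus 2^(k+1) with k = min(m, n)
        hx, hy = t << n, s << m
        f0, g0 = F(x, y)
        f1, g1 = F(x + hx, y + hy)
        # (hx, hy) * [[1, x + 1], [0, 1]] = (hx, hx * (x + 1) + hy)
        if (f1 - f0 - hx) % mod or (g1 - g0 - hx * (x + 1) - hy) % mod:
            return False
    return True

print(jacobi_congruence_holds())
```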

Now let $F = (f_1, \ldots, f_m)\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(m)}$ and $f\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p$ be compatible
functions, which are uniformly differentiable on $\mathbb{Z}_p^{(n)}$ modulo $p$. This is a
relatively weak restriction since all functions uniformly differentiable on $\mathbb{Z}_p^{(n)}$,
as well as functions uniformly differentiable on $\mathbb{Z}_p^{(n)}$ modulo $p^k$
for some $k \ge 1$, are uniformly differentiable on $\mathbb{Z}_p^{(n)}$ modulo $p$; note that
$\frac{\partial F}{\partial x_i} \equiv \frac{\partial_k F}{\partial_k x_i} \equiv \frac{\partial_{k-1} F}{\partial_{k-1} x_i} \pmod{p^{k-1}}$. Moreover, as it was mentioned, all values
of all partial derivatives modulo $p^k$ (and thus, modulo $p$) of $F$ and $f$ are
p-adic integers everywhere on their domains, so to calculate these
values one can use the techniques considered above.



5. Stream ciphers and 2-adic ergodic theory

In this section we discuss what conditions the state update and output functions
of a pseudorandom generator should satisfy to guarantee some crucial
cryptographic properties of the produced sequence. It turns out that whenever
these functions are T-functions, the properties are tightly connected
with the behaviour of the functions with respect to a natural probabilistic
measure on the space $\mathbb{Z}_2$. We start with defining this measure.

5.1. Notions of p-adic dynamics. When we measure the area of a figure
on a plane (or the volume of a body in space), we associate a real number
to the figure (resp., to the body). These are natural examples of measures.
We are not going to recall basic notions of measure theory here, referring to
any book on this topic. We only mention that we could define a measure
$\mu$ on some set $\mathbb{S}$ by assigning non-negative real numbers to some subsets
that are called elementary. All other measurable subsets are compositions of
these elementary subsets with respect to countable unions, intersections, and
complements. Actually, if a measurable subset $S \subset \mathbb{S}$ is a disjoint union of
elementary measurable subsets $E_j$, $S = \bigcup_{j=0}^{\infty} E_j$, then $\mu(S) = \sum_{j=0}^{\infty} \mu(E_j)$,
and the series on the right-hand side must be convergent. The set $\mathbb{S}$ with the so
defined measure $\mu$ is called a measurable space.

The elementary subsets in $\mathbb{Z}_p$ are balls $B_{p^{-k}}(a) = a + p^k \mathbb{Z}_p$. To each such
ball we assign a number $\mu_p(B_{p^{-k}}(a)) = p^{-k}$. It could be verified that we
indeed define a measure on the space $\mathbb{Z}_p$, and this measure is a probabilistic
measure, $\mu_p(\mathbb{Z}_p) = 1$. This measure $\mu_p$ is called a (normalized) Haar measure
on $\mathbb{Z}_p$.

We say that we have a dynamical system on a measurable space $\mathbb{S}$ whenever
we consider a triple $(\mathbb{S}; \mu; f)$, where $\mathbb{S}$ is a measurable space with measure
$\mu$, and $f\colon \mathbb{S} \to \mathbb{S}$ is a measurable function; that is, an $f$-preimage of
every measurable subset is a measurable subset. Dynamical systems theory
is a rich mathematical theory which is applied in different parts of science
and industry. As a matter of fact, in this course we will discuss applications
of 2-adic dynamical systems theory to stream cipher design.

A trajectory of a dynamical system is a sequence

$$x_0, \; x_1 = f(x_0), \; \ldots, \; x_i = f(x_{i-1}) = f^i(x_0), \; \ldots$$

of points of the space $\mathbb{S}$; $x_0$ is called an initial point of the trajectory. If
$F\colon \mathbb{S} \to \mathbb{T}$ is a measurable mapping to some other measurable space $\mathbb{T}$ with
a measure $\nu$ (that is, an $F$-preimage of any $\nu$-measurable subset of $\mathbb{T}$ is a
$\mu$-measurable subset of $\mathbb{S}$), the sequence $F(x_0), F(x_1), F(x_2), \ldots$ is called
an observable. Note that the trajectory formally looks like the sequence of
states of a pseudorandom generator, whereas the observable resembles the
output sequence, cf. subsection 2.1. Further we will see that this is not just an
analogy.

Two important notions of dynamical systems theory follow. A mapping
$F\colon \mathbb{S} \to \mathbb{Y}$ of a measurable space $\mathbb{S}$ into a measurable space $\mathbb{Y}$ endowed
with probabilistic measures $\mu$ and $\nu$, respectively, is said to be measure-preserving
(or, sometimes, equiprobable) whenever $\mu(F^{-1}(S)) = \nu(S)$ for
each measurable subset $S \subset \mathbb{Y}$. In case $\mathbb{S} = \mathbb{Y}$ and $\mu = \nu$, a measure-preserving
mapping $F$ is said to be ergodic whenever for each measurable
subset $S$ such that $F^{-1}(S) = S$ either $\mu(S) = 1$ or $\mu(S) = 0$ holds. Loosely
speaking, any invariant set of an ergodic mapping is either nothing or
everything.

The p-adic ergodic theory studies ergodic (with respect to the Haar measure)
transformations of the space of p-adic numbers, conditions that provide
ergodicity, etc. It is a rapidly developing mathematical theory with various
applications, see e.g. [13]. Actually, as we will see, the course is a
development of p-adic ergodic theory with special interest in pseudorandom
number generators (particularly, stream ciphers).^9 And now it is the right time
to discuss how the above notions are related to properties of pseudorandom
generators.

5.2. What is a good PRNG. A PRNG that could be considered any
good must obviously meet the following conditions:

• The output sequence must be pseudorandom (i.e., must pass certain
statistical tests).

• For cryptographic applications, given a segment $z_j, z_{j+1}, \ldots, z_{j+s-1}$
of the output sequence, finding the corresponding initial state (which
usually is a key) must be infeasible in some properly defined sense.

• The PRNG must be suitable for software (or hardware) implementation;
the performance must be sufficiently fast.

In case the PRNG is an automaton described by Figure 1 we could re-state
these conditions as follows:

First of all, we state

Condition 1: The state update function $f$ must provide pseudorandomness;
in particular, it must guarantee uniform distribution
and long period of the state update sequence $\{u_i\}$.

It would be great if this sequence were secure; that is, given $u_i$, it is infeasible
either to find (or to predict) $u_{i+1}$ or to find $u_0$. Unfortunately, it is
not easy to provide these properties: generators that are 'provably secure',
that is, supplied with proofs (which are based on some plausible, yet still
unproven conjectures) that their output sequences cannot be predicted by
polynomial-time algorithms, are too slow for most practical applications.
In real life one has to undertake additional efforts to make the algorithm
secure. Usually this could be achieved with the use of the output function.
Thus, we need

Condition 2: The output function $F$ must not spoil pseudorandomness
(at least, the output sequence $\{z_i\}$ must be
uniformly distributed and must have long period).

Moreover, in cryptographic applications the function $F$
must make the PRNG secure (in particular, given $z_i$, it
must be difficult to find $u_i$ from the equation $z_i = F(u_i)$).



^9 By the way, methods developed within this approach could be applied to solve some
problems of p-adic ergodic theory, see [1].

Finally, we can formulate

Condition 3: To make the PRNG suitable for software/hardware
implementations, both $f$ and $F$ must be compositions of basic
processor instructions.

In section 2 we have already discussed how one could satisfy condition 3:
it is sufficient to choose both $f$ and $F$ from the class of T-functions. Thus,
we can assume that $f\colon \mathbb{Z}/2^n\mathbb{Z} \to \mathbb{Z}/2^n\mathbb{Z}$ and $F\colon \mathbb{Z}/2^n\mathbb{Z} \to \mathbb{Z}/2^m\mathbb{Z}$ (usually,
$m \le n$).

Now, to satisfy condition 1, one could take the state update function
$f\colon \mathbb{Z}/2^n\mathbb{Z} \to \mathbb{Z}/2^n\mathbb{Z}$ with a single cycle property; that is, $f$ permutes
elements of $\mathbb{Z}/2^n\mathbb{Z}$ cyclically.

The state update sequence

$$u_0, \; u_1 = f(u_0), \; \ldots, \; u_{i+1} = f(u_i) = f^{i+1}(u_0), \; \ldots$$

of n-bit words will then have the longest possible period (of length $2^n$), and
strict uniform distribution; that is, each n-bit word will occur in the period
exactly once.

To satisfy the first part of condition 2, one could take the output function
$F\colon \mathbb{Z}/2^n\mathbb{Z} \to \mathbb{Z}/2^m\mathbb{Z}$ to be balanced: that is, to each m-bit word the
mapping $F$ maps the same number of n-bit words (that's why $m \le n$).
For $m = n$ balanced mappings are just invertible (that is, bijective, one-to-one)
mappings. Obviously, if a balanced output function is applied to a
strictly uniformly distributed sequence of states, the output sequence (of m-bit
words) is also strictly uniformly distributed: it is periodic with a period
of length $2^n$, and each m-bit word occurs in the period exactly $2^{n-m}$ times.

For $m \ll n$, balanced functions could serve us to satisfy the second part
of condition 2, since the equation $z_i = F(u_i)$ then has too many solutions,
$2^{n-m}$ (so it is infeasible for an attacker to try them all).
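
Conditions 1 and 2 can be observed on a toy generator. In the sketch below the state update is the function $x \mapsto x + (x^2 \,\mathrm{OR}\, 5)$ discussed later in Example 6.2; the output function (keeping the $m$ high-order bits) and all names are assumptions made for illustration, not a recommended cipher design.

```python
from collections import Counter

def f(u, n):
    """State update x -> x + (x^2 OR 5) on n-bit words."""
    return (u + (u * u | 5)) % (1 << n)

def F(u, n, m):
    """Hypothetical balanced output function: the m high-order bits of the state."""
    return u >> (n - m)

def run(n, m, u0=0):
    """Collect 2^n states and outputs; also return the state reached afterwards."""
    states, outputs, u = [], [], u0
    for _ in range(1 << n):
        states.append(u)
        outputs.append(F(u, n, m))
        u = f(u, n)
    return states, outputs, u

states, outputs, last = run(n=10, m=3)
print(len(set(states)), Counter(outputs).most_common(2), last)
```

With a single-cycle state update every 10-bit word appears once per period, so each 3-bit output value appears $2^{10-3} = 128$ times. High-order bits are used here because, for any T-function, the k low-order bits of the state sequence repeat with period at most $2^k$.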

Thus, we must know how to construct balanced (or single-cycle) functions
out of basic processor instructions. This is where non-Archimedean
analysis comes into play!

5.3. A bridge. Now we make our studies more formal. Let $F\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(m)}$ be
a compatible function; that is, let $F$ satisfy the p-adic Lipschitz condition
with a constant 1 (see section 4). In other words, for every $k = 1, 2, \ldots$,
and for every $a, b \in \mathbb{Z}_p^{(n)}$, $F(a) \equiv F(b) \pmod{p^k}$ whenever $a \equiv b \pmod{p^k}$
(see subsection 4.3 for the definition of mod $p^k$). This means that, given a
compatible mapping $F\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(m)}$, its reduction $F \bmod p^k$ modulo $p^k$ is a
well defined mapping

$$F \bmod p^k\colon (\mathbb{Z}/p^k\mathbb{Z})^{(n)} \to (\mathbb{Z}/p^k\mathbb{Z})^{(m)}$$

of respective Cartesian powers of the residue ring $\mathbb{Z}/p^k\mathbb{Z}$. We call the mapping
$F \bmod p^k$ the induced mapping. The idea is quite clear: reduction
modulo $p^k$ just deletes all most significant digits (starting with the k-th
digit) both of arguments and of values of the function $F$.

Definition 5.1. A compatible mapping $F\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is said to be bijective
(resp., transitive) modulo $p^k$ iff the induced mapping $x \mapsto F(x) \pmod{p^k}$ is
a (single-cycle) permutation of the elements of the ring $\mathbb{Z}/p^k\mathbb{Z}$.

Balance modulo $p^k$ could be defined by analogy. Now we can state the
central result of this section:

Theorem 5.2 (see [5]). For $m = n = 1$, a compatible mapping $F\colon \mathbb{Z}_p \to \mathbb{Z}_p$
preserves the normalized Haar measure $\mu_p$ on $\mathbb{Z}_p$ (resp., is ergodic with
respect to $\mu_p$) if and only if it is bijective (resp., transitive) modulo $p^k$ for
all $k = 1, 2, 3, \ldots$.

For $n \ge m$, the mapping $F$ preserves the measure $\mu_p$ if and only if it induces
a balanced mapping of $(\mathbb{Z}/p^k\mathbb{Z})^{(n)}$ onto $(\mathbb{Z}/p^k\mathbb{Z})^{(m)}$, for all $k = 1, 2, 3, \ldots$.

This theorem acts like a bridge between p-adic ergodic theory and stream
cipher design: we consider the corresponding PRNG as an approximation, with
respect to the 2-adic metric, of some ergodic dynamical system on 2-adic integers.
In a pseudorandom generator, we can take compatible ergodic functions
for state update functions; also we can take compatible measure-preserving
functions for output functions. A computer performs the reduction modulo $2^n$
automatically. In particular, for $p = 2$ from theorem 5.2 we obtain:

• measure preservation = invertibility modulo $2^k$ for all $k \in \mathbb{N}$;

• in dimensions > 1, i.e., for $F\colon \mathbb{Z}_2^{(n)} \to \mathbb{Z}_2^{(m)}$,
measure preservation = balance modulo $2^k$ for all $k \in \mathbb{N}$;

• ergodicity = single cycle property modulo $2^k$ for all $k \in \mathbb{N}$.

In other words, a compatible function $F\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ is measure-preserving
(respectively, ergodic) if and only if the corresponding T-function $F \bmod 2^n$
on n-bit words (which is merely an approximation of $F$ with precision $\frac{1}{2^n}$)
is invertible or, respectively, has a single cycle property, for every $n$!

Now the problem is how to describe these measure-preserving (in partic- 
ular, ergodic) mappings in the class of all compatible mappings. We start 
to develop some theory to answer the following questions: What composi- 
tions of basic instructions are measure-preserving? are ergodic? Given a 
composition of basic instructions, is it measure-preserving? is it ergodic? 

6. Tools 

The main goal of this section is to describe some tools with the use of 
which we could answer the above stated questions. However, we start with 
some historical observations. 

6.1. A phenomenon. Study of pseudorandom generators has a long his- 
tory. You can read about this issue in, for instance, an excellent book of 
Donald Knuth [16]. Here we discuss briefly a short passage of this long story, 
aiming to make some important observations. 

One could notice that the behavior of a mapping modulo $p^N$, where $N$ is big,
is often totally determined by the behavior of this mapping modulo $p^n$, where $n$
is small. One of the first generators that demonstrate this behaviour is

Linear Congruential Generator (Hull and Dobell, 1962):

The mapping

$$x \mapsto a \cdot x + b \pmod{p^N},$$

where $a, b \in \mathbb{Z}$, $N \ge 2$, is a permutation with a single cycle property if and
only if $x \mapsto a \cdot x + b \pmod{p^n}$ is a permutation with a single cycle property
for $n = 1$ in case $p$ is odd, or for $n = 2$ otherwise.

The next important example is

Bijectivity Criterion for Polynomials with Integer Coefficients (proved and
re-proved by a number of authors; known since the 1960s):

The mapping

$$x \mapsto f(x) \pmod{p^N},$$

where $N \ge 2$ and $f$ is a polynomial with rational integer coefficients, is
bijective if and only if $x \mapsto f(x) \pmod{p^n}$ is bijective for $n = 2$.

Yet another example:

Quadratic Generator (Coveyou, 1969):

The mapping

$$x \mapsto f(x) \pmod{p^N},$$

where $N \ge 3$ and $f$ is a quadratic polynomial with rational integer coefficients,
is a permutation with a single cycle property iff $x \mapsto f(x) \pmod{p^n}$
is a permutation with a single cycle property for $n = 3$ in case $p \in \{2, 3\}$,
or for $n = 2$ otherwise.

It is worth noticing here that in the 1980s M. V. Larin proved that the word
'quadratic' in the statement could be omitted! The result was circulated as a
manuscript at that time; a journal publication [18] appeared much later.
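
The Linear Congruential Generator statement above lends itself to a brute-force check on small moduli. The following sketch is illustrative; the helper names and the coefficient range are assumptions, and exhaustive search over a few small cases is of course not a proof.

```python
def is_single_cycle(f, modulus):
    """True iff x -> f(x) mod modulus is one cycle through all residues."""
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps >= modulus:
            break
    # The orbit of 0 must return to 0 after exactly `modulus` steps.
    return x == 0 and steps == modulus

def lcg_phenomenon_holds(p, N, coeff_range=range(0, 12)):
    """Compare the single cycle property modulo p^N with that modulo p (or 4 for p = 2)."""
    n = 2 if p == 2 else 1
    for a in coeff_range:
        for b in coeff_range:
            f = lambda x, a=a, b=b: a * x + b
            if is_single_cycle(f, p**N) != is_single_cycle(f, p**n):
                return False
    return True

print(lcg_phenomenon_holds(2, 6), lcg_phenomenon_holds(3, 4), lcg_phenomenon_holds(5, 3))
```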

6.2. Explanation: p-adic derivations. Looking at the examples of the
preceding subsection, we naturally start suspecting that some very strong
reason for such behaviour must exist! The following theorem, which was
published in 1993 [4, 3], gives an explanation:

Theorem 6.1. Let a compatible function $F\colon \mathbb{Z}_p \to \mathbb{Z}_p$ be uniformly differentiable
modulo $p^2$. Then $F$ is ergodic if and only if it is transitive modulo
$p^{N_2(F)+1}$ for odd prime $p$ or, respectively, modulo $2^{N_2(F)+2}$ for $p = 2$.

This theorem works for a much wider class of functions than the ones
mentioned in the above examples. Actually, this class includes functions
that are compositions not only of arithmetic operations, but of
logical operations as well. To illustrate the techniques, consider the following
example.

Example 6.2. In their paper [15] of 2002 Klimov and Shamir write that

...neither the invertibility nor the cycle structure of $x + (x^2 \vee 5)$
could be determined by his (i.e., mine; V. A.) techniques.

See however how it could be immediately done with the use of Theorem 6.1:
the function $f(x) = x + (x^2 \vee 5)$ is uniformly differentiable on $\mathbb{Z}_2$; thus, it is
uniformly differentiable modulo 4 (see 4.6 and an example thereafter), and
$N_2(f) = 3$. Indeed, $(x + h) \,\mathrm{OR}\, 5 = (x \,\mathrm{OR}\, 5) + h$ whenever $h \equiv 0 \pmod 8$ (the
latter congruence is obvious since the base-2 expansion of 5 is ...000101).

Now to prove that $f$ is ergodic, in view of 6.1 it suffices to demonstrate
that $f$ induces a permutation with a single cycle on $\mathbb{Z}/32\mathbb{Z}$. Direct calculations
show that the string

$$0, \; f(0) \bmod 32, \; f^2(0) \bmod 32 = f(f(0)) \bmod 32, \; \ldots, \; f^{31}(0) \bmod 32$$

is a permutation of the string 0, 1, 2, ..., 31, thus ending the proof.
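
The 'direct calculations' of this example take a few lines of code. The sketch below (helper names are illustrative assumptions) lists the orbit of 0 modulo 32 and also spot-checks the identity $(x + h) \,\mathrm{OR}\, 5 = (x \,\mathrm{OR}\, 5) + h$ for $h \equiv 0 \pmod 8$.

```python
def f(x):
    """The function x + (x^2 OR 5) of Example 6.2, evaluated over the integers."""
    return x + (x * x | 5)

def orbit_mod(mod, start=0):
    """The string start, f(start) mod m, f^2(start) mod m, ... of length m."""
    xs, x = [], start
    for _ in range(mod):
        xs.append(x)
        x = f(x) % mod
    return xs

def or_identity_holds(limit=256):
    """(x + h) OR 5 == (x OR 5) + h whenever h is a multiple of 8."""
    return all(((x + 8 * s) | 5) == (x | 5) + 8 * s
               for x in range(limit) for s in range(16))

orbit = orbit_mod(32)
print(orbit)
print(sorted(orbit) == list(range(32)), or_identity_holds())
```

By Theorem 6.1 the same single cycle behaviour then holds modulo every larger power of 2, which can be confirmed experimentally by calling `orbit_mod` with, say, `1 << 12`.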
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In connection with Theorem 6.1, the following natural question arises: 
What about ergodicity in higher dimensions? Unfortunately, for uniformly
differentiable modulo $p$ functions the answer is negative. The following result
could be considered as a non-existence theorem for compatible smooth
ergodic mappings in higher dimensions.

Theorem 6.3 (see [4, 3]). Let the function $F = (f_1, \ldots, f_n)\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(n)}$
be compatible, ergodic, and uniformly differentiable modulo $p$ on $\mathbb{Z}_p^{(n)}$. Then
$n = 1$.

Note. Compatible ergodic mappings that are not uniformly differentiable modulo $p$ do exist for $n > 1$.

The following theorem, which uses derivations modulo $p$ instead of $p^2$,
could be applied to construct balanced mappings to serve as output functions
of a PRNG.

Theorem 6.4 (see [ ]). Let $F\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(m)}$ be a compatible function that is
uniformly differentiable modulo $p$. Then $F$ preserves measure whenever it
is balanced modulo $p^k$ for some $k \ge N_1(F)$ and the rank of its Jacobi matrix
$F'_1(u)$ modulo $p$ is exactly $m$ at all points $u = (u_1, \ldots, u_n) \in (\mathbb{Z}/p^k)^{(n)}$.

Proof. For $\xi \in (\mathbb{Z}/p^s)^{(m)}$ denote

$$F_s^{-1}(\xi) = \{\gamma \in (\mathbb{Z}/p^s)^{(n)} : F(\gamma) \equiv \xi \pmod{p^s}\}.$$

Let $s \ge k \ge N_1(F)$. Since $F$ is compatible, and hence $F$ is a sum of a
compatible function and a periodic function with period $p^{N_1(F)}$ (see 2.10 of
[ ]), we conclude that if $\eta \in F_{s+1}^{-1}(\xi)$ then $\bar{\eta} \in F_s^{-1}(\bar{\xi})$. Here and further we
denote via $\bar{\alpha} = (\bar{\alpha}_1, \ldots, \bar{\alpha}_m) \in (\mathbb{Z}/p^s)^{(m)}$ the residue modulo $p^s$, $\bar{\alpha} = \alpha \bmod p^s =
(\alpha_1 \bmod p^s, \ldots, \alpha_m \bmod p^s)$, where $\alpha = (\alpha_1, \ldots, \alpha_m) \in (\mathbb{Z}/p^{s+1})^{(m)}$.

Put $\lambda = \bar{\eta} + p^s \alpha \in (\mathbb{Z}/p^{s+1})^{(n)}$, where $\alpha \in (\mathbb{Z}/p)^{(n)}$. In view of the uniform
differentiability of the function $F$ modulo $p$ (see 4.4), we have

$$F(\lambda) \equiv F(\bar{\eta}) + p^s \alpha F'_1(\bar{\eta}) \pmod{p^{s+1}}. \tag{6.4.1}$$

Since $F(\bar{\eta}) \equiv \bar{\xi} + p^s \beta \pmod{p^{s+1}}$ and $\xi = \bar{\xi} + p^s \gamma$ for suitable $\beta, \gamma \in
(\mathbb{Z}/p)^{(m)}$, in view of (6.4.1) we conclude that $\lambda \in F_{s+1}^{-1}(\xi)$ if and only if
$\bar{\lambda} \in F_s^{-1}(\bar{\xi})$ (i.e., $\bar{\eta} \in F_s^{-1}(\bar{\xi})$) and $\alpha$ satisfies the following system of linear
equations over the finite field $\mathbb{Z}/p$:

$$\beta + \alpha F'_1(\bar{\eta}) = \gamma. \tag{6.4.2}$$

Thus, if the columns of the matrix $F'_1(\bar{\eta})$ are linearly independent over $\mathbb{Z}/p$,
then the linear system (6.4.2) has exactly $p^{n-m}$ pairwise distinct solutions for
arbitrary $\beta, \gamma \in (\mathbb{Z}/p)^{(m)}$. From here it follows that

$$|F_{s+1}^{-1}(\xi)| = |F_s^{-1}(\bar{\xi})| \, p^{n-m}. \tag{6.4.3}$$

Hence, if $F$ is equiprobable modulo $p^k$ (i.e., if $|F_k^{-1}(\xi)|$ does not depend
on $\xi$) and if the rank of the matrix $F'_1(\eta)$ is $m$, then (6.4.3) implies that $F$ is
balanced modulo $p^{s+1}$. □

Corollary 6.5. Under assumptions of theorem 6.4:

• If $m = 1$, then $F$ is measure-preserving whenever $F$ is balanced
modulo $p^k$ for some $k \ge N_1(F)$, and the differential $d_1 F$ modulo $p$
of the function $F$ vanishes at no point of $(\mathbb{Z}/p^k\mathbb{Z})^{(n)}$.

• Let $f(x_1, \ldots, x_n)$ be a polynomial in variables $x_1, \ldots, x_n$, and let all
coefficients of $f$ be p-adic integers. The polynomial $f$ preserves
measure whenever it is balanced modulo $p$ and all its partial derivatives
vanish simultaneously modulo $p$ at no point of $(\mathbb{Z}/p\mathbb{Z})^{(n)}$ (i.e.,
are simultaneously congruent to 0 modulo $p$ nowhere on $(\mathbb{Z}/p\mathbb{Z})^{(n)}$).

For $m = n$ the above stated sufficient conditions of measure preservation
also become necessary ones.

Theorem 6.6. A compatible and uniformly differentiable modulo $p$ function

$$F = (f_1, \ldots, f_n)\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(n)}$$

preserves measure if and only if it is bijective modulo $p^{N_1(F)}$ and its Jacobian
modulo $p$ vanishes at no point of $(\mathbb{Z}/p^{N_1(F)}\mathbb{Z})^{(n)}$ (Equivalent condition: if and
only if $F$ is bijective modulo $p^{N_1(F)+1}$).

Proof. If $F$ is bijective modulo $p^{N_1(F)}$, and if its Jacobian modulo $p$ vanishes
nowhere on $(\mathbb{Z}/p^{N_1(F)})^{(n)}$, then in view of Theorem 6.4 $F$ preserves measure.

Vice versa, let $F$ preserve measure, i.e., let $F$ be bijective modulo $p^k$
for all $k \ge N$, where $N$ is some positive rational integer. Now take $k \ge
\max\{N, N_1(F)\}$; then the definition of uniform differentiability modulo $p$
implies that

$$F(u + p^k \alpha) \equiv F(u) + p^k \alpha F'_1(u) \pmod{p^{k+1}} \tag{6.6.1}$$

for all $u, \alpha \in \mathbb{Z}_p^{(n)}$. Here $F'_1(u)$ is an $n \times n$ matrix over the field $\mathbb{Z}/p$. If $\det F'_1(u) \equiv 0$
(mod $p$) for some $u \in \mathbb{Z}_p^{(n)}$ (or, the same, for some $u \in \{0, 1, \ldots, p^{N_1(F)} - 1\}^{(n)}$
in view of the periodicity of partial derivatives modulo $p$), then there exists
$\alpha \in \{0, 1, \ldots, p - 1\}^{(n)}$, $\alpha \not\equiv (0, \ldots, 0) \pmod p$, such that $\alpha F'_1(u) \equiv (0, \ldots, 0)$
(mod $p$). But then (6.6.1) implies that $F(u + p^k \alpha) \equiv F(u) \pmod{p^{k+1}}$. The
latter contradicts the bijectivity modulo $p^{k+1}$ of the function $F$, since for
$u \in \{0, 1, \ldots, p^{N_1(F)} - 1\}^{(n)}$ we have $u, u + p^k \alpha \in \{0, 1, \ldots, p^{k+1} - 1\}^{(n)}$ and
$u + p^k \alpha \ne u$.

Now we prove the criterion in the equivalent form. Let $F$ be bijective
modulo $p^{N_1(F)+1}$. Then assuming $k = N_1(F)$ in the above argument, we
conclude that $\det F'_1(u) \not\equiv 0 \pmod p$ for all $u \in \mathbb{Z}_p^{(n)}$. According to Theorem
6.4, this implies that $F$ preserves measure.

Let $F$ preserve measure, and let $F$ be not bijective modulo $p^k$ for some
$k \ge N_1(F)$. We prove that in this case $F$ is not bijective modulo $p^{k+1}$.

Choose $u, v \in \{0, 1, \ldots, p^k - 1\}^{(n)}$ such that $u \ne v$, $F(u) \equiv F(v) \pmod{p^k}$.
Then either $F(u) \equiv F(v) \pmod{p^{k+1}}$ (i.e., $F$ is not bijective modulo $p^{k+1}$),
or $F(u) \not\equiv F(v) \pmod{p^{k+1}}$. Yet in the latter case we have $F(u) \equiv F(v) +
p^k \alpha \pmod{p^{k+1}}$ for some $\alpha \in \{0, 1, \ldots, p - 1\}^{(n)}$, $\alpha \not\equiv (0, \ldots, 0) \pmod p$.
Consider $u_1 = u + p^k \beta$, where $\beta \in \{0, 1, \ldots, p - 1\}^{(n)}$ with $\beta \not\equiv (0, \ldots, 0)$
(mod $p$) and $\beta F'_1(u) + \alpha \equiv (0, \ldots, 0) \pmod p$. Such $\beta$ exists, since $F$ preserves
measure and, consequently, $\det F'_1(u) \not\equiv 0 \pmod p$, as has been
proven already. Now the definition of uniform differentiability modulo $p$
implies that

$$F(u + p^k \beta) \equiv F(u) + p^k \beta F'_1(u) \equiv F(v) + p^k \alpha + p^k \beta F'_1(u) \equiv F(v) \pmod{p^{k+1}}, \tag{6.6.2}$$

where $u + p^k \beta \in \{0, 1, \ldots, p^{k+1} - 1\}^{(n)}$ and $u + p^k \beta \ne v$ (since $u \ne v$). Thus
(6.6.2) in combination with our assumption implies that $F$ is not bijective
modulo $p^{k+1}$. Applying this argument a sufficient number of times, we conclude
that $F$ is not bijective modulo $p^s$ for all $s \ge k$. But at the same time
$F$ preserves measure. A contradiction. □

Comparing theorems 6.4 and 6.6 one may ask whether sufficient conditions 
of theorem 6.4 are also necessary. The answer is negative: In [5] it is proved 
that the function f(x,y) = 2x + y 3 on Z2 provides a counter-example. 
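
The counter-example can be explored by brute force for small moduli. The sketch below (illustrative helper names, a check for a few values of $k$ only, not a proof) counts preimages of $f(x, y) = 2x + y^3$ modulo $2^k$ and evaluates its partial derivatives modulo 2, which vanish simultaneously at even $y$.

```python
from collections import Counter

def preimage_counts(k):
    """Number of pairs (x, y) modulo 2^k mapped by 2x + y^3 to each residue modulo 2^k."""
    mod = 1 << k
    return Counter((2 * x + y**3) % mod for x in range(mod) for y in range(mod))

def differential_mod_2(y):
    """Partial derivatives (2, 3y^2) of 2x + y^3, reduced modulo 2."""
    return (2 % 2, (3 * y * y) % 2)

for k in range(1, 6):
    counts = preimage_counts(k)
    print(k, len(counts) == 1 << k, set(counts.values()))
print(differential_mod_2(0), differential_mod_2(1))
```

For each tested $k$ every residue has the same number of preimages, so the map is balanced there, although the rank condition of theorem 6.4 fails wherever $y$ is even.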

Open question. Characterize all compatible measure-preserving mappings

$$F = (f_1, \ldots, f_m)\colon \mathbb{Z}_p^{(n)} \to \mathbb{Z}_p^{(m)}$$

with $m < n$. The answer is not known even under the restriction that all $f_i$ are
polynomials over $\mathbb{Z}_p$.

The technique presented in this subsection is rather effective: actually,
all the examples of the preceding subsection could be deduced from the results
of this subsection. Moreover, all results of [15] could also be proved by these
techniques. We re-prove these results to illustrate our techniques:

Examples 6.7. The following is true:

(1) A mapping

$$(x, y) \mapsto F(x, y) = (x \oplus 2(x \wedge y), (y + 3x^3) \oplus x) \bmod 2^r$$

of $(\mathbb{Z}/2^r)^{(2)}$ onto $(\mathbb{Z}/2^r)^{(2)}$ is bijective for all $r = 1, 2, \ldots$

Indeed, the function $F$ is bijective modulo $2^{N_1(F)} = 2$ (direct
verification) and $\det(F'_1(u)) \equiv 1 \pmod 2$ for all $u \in (\mathbb{Z}/2)^{(2)}$ (see 4.6
and example thereafter).

(2) The following mappings of $\mathbb{Z}/2^r$ onto $\mathbb{Z}/2^r$ are bijective for all $r =
1, 2, \ldots$:

$$x \mapsto (x + 2x^2) \bmod 2^r,$$
$$x \mapsto (x + (x^2 \vee 1)) \bmod 2^r,$$
$$x \mapsto (x \oplus (x^2 \vee 1)) \bmod 2^r.$$

Indeed, all three mappings are uniformly differentiable modulo 2,
and $N_1 = 1$ for all of them. So it suffices to prove that all three
mappings are bijective modulo 2, i.e. as mappings of the residue
ring $\mathbb{Z}/2$ modulo 2 onto itself (this could be checked by direct
calculations), and that their derivatives modulo 2 vanish at no point of
$\mathbb{Z}/2$. The latter also holds, since the derivatives are, respectively,

$$1 + 4x \equiv 1 \pmod 2,$$
$$1 + 2x \cdot 1 \equiv 1 \pmod 2,$$
$$1 + 2x \cdot 1 \equiv 1 \pmod 2,$$

since $(x^2 \vee 1)' = 2x \cdot 1 \equiv 0 \pmod 2$, and $(x \oplus C)'_1 \equiv 1 \pmod 2$ (see
4.6).

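(3) The following closely related variants of the previous mappings of
$\mathbb{Z}/2^r$ onto $\mathbb{Z}/2^r$ are not bijective:

$$x \mapsto (x + x^2) \bmod 2^r,$$
$$x \mapsto (x + (x^2 \wedge 1)) \bmod 2^r,$$
$$x \mapsto (x + (x^3 \vee 1)) \bmod 2^r.$$

The first two are compatible but not bijective modulo 2, so they are not
bijective for any $r = 1, 2, \dots$ The third one is bijective modulo 2, but
its derivative modulo 2 vanishes at $x = 1$, and it already fails to be
bijective modulo 4 (both 1 and 3 are mapped to 2); hence it is not bijective
for any $r \ge 2$.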

(4) (see [21], also [15, Theorem 1]) Let $P(x) = a_0 + a_1x + \cdots + a_dx^d$ be
a polynomial with integral coefficients. Then $P(x)$ is a permutation
polynomial (i.e., is bijective) modulo $2^n$, $n > 1$, if and only if $a_1$ is
odd, $(a_2 + a_4 + \cdots)$ is even, and $(a_3 + a_5 + \cdots)$ is even.

In view of 6.6 we must verify whether the two conditions hold:
first, whether $P$ is bijective modulo 2, and second, whether $P'(z) \equiv 1 \pmod 2$
for $z \in \{0, 1\}$. The first condition implies that $P(0) = a_0$
and $P(1) = a_0 + a_1 + a_2 + \cdots + a_d$ must be distinct modulo 2; hence
$a_1 + a_2 + \cdots + a_d \equiv 1 \pmod 2$. The second condition implies that
$P'(0) = a_1 \equiv 1 \pmod 2$ and $P'(1) \equiv a_1 + a_3 + a_5 + \cdots \equiv 1 \pmod 2$.
Now combining all this together we get $a_2 + a_3 + \cdots + a_d \equiv 0 \pmod 2$
and $a_3 + a_5 + \cdots \equiv 0 \pmod 2$, hence $a_2 + a_4 + \cdots \equiv 0 \pmod 2$.

(5) As a bonus, we can use exactly the same proof to get exactly the
same characterization of bijective modulo $2^r$ ($r = 1, 2, \dots$) mappings
of the form $x \mapsto P(x) = a_0 \oplus a_1x \oplus \cdots \oplus a_dx^d \bmod 2^r$, since $u \oplus v$
is uniformly differentiable modulo 2 as a bivariate function, and its
derivative modulo 2 is exactly the same as the derivative of $u + v$,
and besides, $u \oplus v \equiv u + v \pmod 2$.
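
The claims of Examples 6.7 are easy to confirm by exhaustive search for small word lengths. The following Python sketch does so; the helper names (`is_bijection`, `pair_map`, `univariate_maps`, `rivest_criterion`, `poly_map`) are illustrative and not taken from the notes, and a brute-force check of this kind only supports the statements for the tested moduli, it does not replace the proofs.

```python
def is_bijection(f, size):
    """True if f maps range(size) onto range(size)."""
    return len({f(x) for x in range(size)}) == size

def pair_map(r):
    """Item (1): (x, y) -> (x XOR 2(x AND y), (y + 3x^3) XOR x) mod 2^r.
    A pair is packed into one integer so that is_bijection can be reused."""
    mask = (1 << r) - 1
    def F(code):
        x, y = code >> r, code & mask
        u = (x ^ (2 * (x & y))) & mask
        v = ((y + 3 * x**3) ^ x) & mask
        return (u << r) | v
    return F

def univariate_maps(r):
    """Items (2) and (3): the three bijective maps and their three variants."""
    mask = (1 << r) - 1
    good = [lambda x: (x + 2 * x * x) & mask,
            lambda x: (x + (x * x | 1)) & mask,
            lambda x: (x ^ (x * x | 1)) & mask]
    bad = [lambda x: (x + x * x) & mask,
           lambda x: (x + (x * x & 1)) & mask,
           lambda x: (x + (x**3 | 1)) & mask]
    return good, bad

def rivest_criterion(coeffs):
    """Item (4): a_1 odd, a_2 + a_4 + ... even, a_3 + a_5 + ... even."""
    a = list(coeffs) + [0, 0, 0]
    return a[1] % 2 == 1 and sum(a[2::2]) % 2 == 0 and sum(a[3::2]) % 2 == 0

def poly_map(coeffs, n):
    """x -> a_0 + a_1 x + ... + a_d x^d mod 2^n."""
    mask = (1 << n) - 1
    return lambda x: sum(a * x**k for k, a in enumerate(coeffs)) & mask

if __name__ == "__main__":
    for r in range(1, 6):
        good, bad = univariate_maps(r)
        print(r, is_bijection(pair_map(r), 1 << (2 * r)),
              [is_bijection(f, 1 << r) for f in good],
              [is_bijection(f, 1 << r) for f in bad])
```

Note that the third variant in item (3), $x + (x^3 \vee 1)$, is still a bijection for $r = 1$ and loses bijectivity from $r = 2$ on, which the printed table makes visible.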

Note that in general theorems 6.4 and 6.6 could be applied to a class 
of functions that is narrower than the class of all compatible functions. 
However, it turns out that for p = 2 this is not the case. Namely, the 
following proposition holds: 

Proposition 6.8. ([ , Corollary 4.6], [4, Corollary 4.4]) If a compatible
function $g\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ preserves measure then it is uniformly differentiable
modulo 2, and its derivative modulo 2 is always 1 modulo 2.

The above results are good to verify whether a given function preserves 
measure or is ergodic. However, we need more tools to construct measure- 
preserving, (respectively, ergodic) mappings in explicit form. 

6.3. Mahler's series. We already have mentioned that uniformly continu- 
ous functions defined on (and valuated in) Z p could be uniquely represented 
as Mahler's interpolation series (4.2.2). So, it is natural to express condi- 
tions of measure-preservation or ergodicity in terms of coefficients of these 
series. 

Theorem 6.9 ([3, 4, 5]). For $p = 2$ a function $f\colon \mathbb{Z}_p \to \mathbb{Z}_p$ is compatible
and measure-preserving if and only if it could be represented as

$$f(x) = c_0 + x + \sum_{i=1}^{\infty} c_i\, p^{\lfloor \log_p i \rfloor + 1} \binom{x}{i} \qquad (x \in \mathbb{Z}_p).$$

The function $f$ is compatible and ergodic if and only if it could be represented
as

$$f(x) = 1 + x + \sum_{i=0}^{\infty} c_i\, p^{\lfloor \log_p (i+1) \rfloor + 1} \binom{x}{i} \qquad (x \in \mathbb{Z}_p),$$

where $c_0, c_1, c_2, \dots \in \mathbb{Z}_p$. For $p \ne 2$ these conditions remain sufficient, but
not necessary.

Thus, in view of theorem 6.9 one can choose a state transition function to
be a polynomial with rational (not necessarily integer) key-dependent
coefficients, setting $c_i = 0$ for all but a finite number of $i$. Note that to determine
whether a given polynomial $f$ with rational (and not necessarily integer)
coefficients is integer valued (that is, maps $\mathbb{Z}_p$ into itself), compatible and
ergodic, it is sufficient to determine whether it induces a cycle on $O(\deg f)$
integral points. To be more exact, the following proposition holds.

Proposition 6.10 ([ ]). A polynomial $f(x)$ with rational, and not
necessarily integer coefficients, is integer valued, compatible, and ergodic (resp.,
measure preserving) if and only if

$$z \mapsto f(z) \bmod p^{\lfloor \log_p (\deg f) \rfloor + 3},$$

where $z$ runs through $0, 1, \dots, p^{\lfloor \log_p (\deg f) \rfloor + 3} - 1$, is a compatible and transitive
(resp., bijective) mapping of the residue ring $\mathbb{Z}/p^{\lfloor \log_p (\deg f) \rfloor + 3}$ onto itself.
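
To see Theorem 6.9 and Proposition 6.10 at work for $p = 2$, the following Python sketch builds a polynomial from a finite list of Mahler coefficients of the ergodic form and checks transitivity by brute force. The function names are illustrative, the coefficients are random test data, and the sum is taken from $i = 0$, which is an assumption made here so that the constant term may be any odd number (this agrees with the congruence $c_0 \equiv 1 \pmod 2$ of Proposition 6.12). Expanding the binomial coefficients turns such an $f$ into a polynomial with rational coefficients.

```python
from math import comb
import random

def mahler_ergodic(c):
    """f(x) = 1 + x + sum_i c[i] * 2^(floor(log2(i+1)) + 1) * C(x, i), i = 0..len(c)-1."""
    def f(x):
        total = 1 + x
        for i, ci in enumerate(c):
            exponent = (i + 1).bit_length()      # equals floor(log2(i+1)) + 1
            total += ci * (1 << exponent) * comb(x, i)
        return total
    return f

def is_transitive(f, modulus):
    """True if x -> f(x) mod modulus is a single cycle on range(modulus)."""
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps > modulus:
            return x == 0 and steps == modulus

if __name__ == "__main__":
    random.seed(2006)
    coeffs = [random.randrange(-50, 50) for _ in range(5)] + [7]   # degree 5
    f = mahler_ergodic(coeffs)
    small = 1 << (2 + 3)          # 2^(floor(log2 5) + 3), as in Proposition 6.10
    print(is_transitive(f, small), is_transitive(f, 1 << 12))
```

Checking the 32 points modulo $2^{\lfloor\log_2 5\rfloor+3}$ is the cheap test suggested by Proposition 6.10; the second, larger modulus is only a consistency check.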

Theorem 6.9 enables one to use exponentiation in the design of generators
that are transitive modulo $2^n$ for all $n = 1, 2, 3, \dots$

Example 6.11. For any odd $a = 1 + 2m$ the function $f(x) = ax + a^x$ defines a
transitive modulo $2^n$ generator $x_{i+1} = f(x_i) \bmod 2^n$.

Indeed, in view of 6.9 the function $f$ defines a compatible and ergodic
mapping of $\mathbb{Z}_2$ onto $\mathbb{Z}_2$ since

$$f(x) = (1 + 2m)x + (1 + 2m)^x = x + 2mx + \sum_{i=0}^{\infty} m^i 2^i \binom{x}{i} = 1 + x + 4m\binom{x}{1} + \sum_{i=2}^{\infty} m^i 2^i \binom{x}{i}$$

and $i \ge \lfloor \log_2 (i+1) \rfloor + 1$ for all $i = 2, 3, 4, \dots$

Such a generator could be of practical value since it uses not more than
$n + 1$ multiplications modulo $2^n$ of $n$-bit numbers; of course, one should use
calls to the table $a^{2^j} \bmod 2^n$, $j = 1, 2, 3, \dots, n - 1$. The latter table must
be precomputed; the corresponding calculations involve $n - 1$ multiplications
modulo $2^n$. Obviously, one can use $m$ as a long-term key, with the initial
state $x_0$ being a short-term key, i.e., one changes $m$ from time to time, but
uses a new $x_0$ for each new message. Obviously, without a properly chosen
output function such a generator is not secure. The choice of output function
is discussed in more detail below.
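
A minimal Python sketch of the generator of Example 6.11 follows, including the precomputed table of powers $a^{2^j} \bmod 2^n$ mentioned above. The names `power_table`, `pow_with_table`, `make_generator` and `period_from` are illustrative; the sketch only exercises the state transition function and, as the text stresses, is not a secure generator without an output function.

```python
def power_table(a, n):
    """table[j] = a^(2^j) mod 2^n for j = 0..n-1 (n - 1 squarings)."""
    modulus = 1 << n
    table = [a % modulus]
    for _ in range(n - 1):
        table.append(table[-1] * table[-1] % modulus)
    return table

def pow_with_table(table, x, n):
    """a^x mod 2^n for 0 <= x < 2^n, multiplying table entries for the set bits of x."""
    modulus = 1 << n
    result = 1
    for j in range(n):
        if (x >> j) & 1:
            result = result * table[j] % modulus
    return result

def make_generator(m, n):
    """State update x -> a*x + a^x mod 2^n with a = 1 + 2m."""
    a, modulus = 1 + 2 * m, 1 << n
    table = power_table(a, n)
    return lambda x: (a * x + pow_with_table(table, x, n)) % modulus

def period_from(step, x0, limit):
    """Length of the cycle through x0, or None if x0 does not recur within limit steps."""
    x, count = step(x0), 1
    while x != x0 and count <= limit:
        x, count = step(x), count + 1
    return count if x == x0 else None

if __name__ == "__main__":
    step = make_generator(m=5, n=16)          # m plays the role of the long-term key
    print(period_from(step, x0=12345, limit=1 << 16))
```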

Note. A similar argument shows that for every prime $p$ and every $a \equiv 1 \pmod p$
the function $f(x) = ax + a^x$ defines a compatible and ergodic
mapping of $\mathbb{Z}_p$ onto itself.

For polynomials with (rational or p-adic) integer coefficients theorem 6.9 
may be restated in the following form. 

Proposition 6.12 ([ , ]). Represent a polynomial $f(x) \in \mathbb{Z}_2[x]$ in a basis
of descending factorial powers

$$x^{\underline{0}} = 1,\quad x^{\underline{1}} = x,\quad x^{\underline{2}} = x(x-1),\ \dots,\ x^{\underline{i}} = x(x-1)\cdots(x-i+1),\ \dots,$$

i.e., let

$$f(x) = \sum_{i=0}^{d} c_i\, x^{\underline{i}}$$

for $c_0, c_1, \dots, c_d \in \mathbb{Z}_2$. Then the polynomial $f$ induces an ergodic (and,
obviously, a compatible) mapping of $\mathbb{Z}_2$ onto itself iff its coefficients $c_0, c_1, c_2, c_3$
satisfy the following congruences:

$$c_0 \equiv 1 \pmod 2,\quad c_1 \equiv 1 \pmod 4,\quad c_2 \equiv 0 \pmod 2,\quad c_3 \equiv 0 \pmod 4.$$

The polynomial $f$ induces a measure preserving mapping iff

$$c_1 \equiv 1 \pmod 2,\quad c_2 \equiv 0 \pmod 2,\quad c_3 \equiv 0 \pmod 2.$$

Thus, to provide ergodicity of the polynomial mapping $f$ it is necessary
and sufficient to hold fixed 6 bits only (one bit of $c_0$, two bits of $c_1$,
one bit of $c_2$, and two bits of $c_3$), while the other bits of coefficients of $f$
may vary (e.g., may be key-dependent). This guarantees transitivity of the
state transition function $z \mapsto f(z) \bmod 2^n$ for each $n$, and hence, uniform
distribution of the output sequence.

Proposition 6.12 implies that the polynomial $f(x) \in \mathbb{Z}[x]$ is ergodic (resp.,
measure preserving) iff it is transitive modulo 8 (resp., iff it is bijective
modulo 4). A corresponding assertion holds in the general case, for an arbitrary
prime $p$.
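
The congruences of Proposition 6.12, and the remark that it suffices to test modulo 8 (resp. modulo 4), can be compared with brute force on small polynomials. The Python sketch below is one way to do this; all names are illustrative, and the larger modulus $2^6$ merely stands in for "all $n$".

```python
from itertools import product

def falling_poly(c):
    """x -> sum_i c[i] * x(x-1)...(x-i+1)."""
    def f(x):
        total, power = 0, 1           # power holds the i-th descending factorial power
        for i, ci in enumerate(c):
            total += ci * power
            power *= (x - i)
        return total
    return f

def is_bijective(f, modulus):
    return len({f(x) % modulus for x in range(modulus)}) == modulus

def is_transitive(f, modulus):
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps > modulus:
            return x == 0 and steps == modulus

def ergodic_by_612(c):
    c = list(c) + [0] * 4
    return c[0] % 2 == 1 and c[1] % 4 == 1 and c[2] % 2 == 0 and c[3] % 4 == 0

def preserving_by_612(c):
    c = list(c) + [0] * 4
    return c[1] % 2 == 1 and c[2] % 2 == 0 and c[3] % 2 == 0

if __name__ == "__main__":
    count = sum(ergodic_by_612(c) for c in product(range(4), repeat=4))
    print(count, "of 256 coefficient patterns modulo 4 are ergodic")
```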

Theorem 6.13 ([18, 5]). A polynomial $f(x) \in \mathbb{Z}_p[x]$ induces an ergodic
mapping of $\mathbb{Z}_p$ onto itself iff it is transitive modulo $p^2$ for $p \ne 2, 3$, or
modulo $p^3$ for $p = 2, 3$. The polynomial $f(x) \in \mathbb{Z}_p[x]$ induces a measure
preserving mapping of $\mathbb{Z}_p$ onto itself iff it is bijective modulo $p^2$.

Example 6.14. The mapping $x \mapsto f(x) = x + 2x^2 \pmod{2^{32}}$ (which is used in
RC6, see [22]) is bijective, since it is bijective modulo 4: $f(0) \equiv 0 \pmod 4$,
$f(1) \equiv 3 \pmod 4$, $f(2) \equiv 2 \pmod 4$, $f(3) \equiv 1 \pmod 4$. Thus, the mapping
$x \mapsto f(x) = x + 2x^2 \pmod{2^n}$ is bijective for all $n = 1, 2, \dots$

Hence, with the use of theorem 6.13 it is possible to obtain transitive
modulo $q > 1$ mappings for an arbitrary natural $q$: one can just take
$f(z) = (1 + z + \hat{q}\,g(z)) \bmod q$, where $g(x) \in \mathbb{Z}[x]$ is an arbitrary polynomial, and
$\hat{q}$ is the product of $p^{s_p}$ over all prime factors $p$ of $q$, where $s_2 = s_3 = 3$, and
$s_p = 2$ for $p \ne 2, 3$. Again, the polynomial $g(x)$ may be chosen, roughly
speaking, 'more or less at random', i.e., it may be key-dependent, but the
output sequence will be uniformly distributed for any choice of $g(x)$. This
assertion may be generalized further.
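
The recipe above can be sketched in a few lines of Python. Here `q_hat` computes the product of $p^{s_p}$ over the prime factors of $q$ (the name and the trial-division helper are illustrative choices, not notation from the notes), and transitivity modulo $q$ is verified by walking the cycle through 0.

```python
def prime_factors(q):
    """Distinct prime factors of q by trial division (adequate for small q)."""
    factors, p = [], 2
    while p * p <= q:
        if q % p == 0:
            factors.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        factors.append(q)
    return factors

def q_hat(q):
    """Product of p^3 for p in {2, 3} and p^2 for the other prime factors p of q."""
    result = 1
    for p in prime_factors(q):
        result *= p ** (3 if p in (2, 3) else 2)
    return result

def make_step(q, g):
    """State update z -> 1 + z + q_hat(q) * g(z) mod q for an integer polynomial g."""
    qh = q_hat(q)
    return lambda z: (1 + z + qh * g(z)) % q

def is_transitive(f, modulus):
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps > modulus:
            return x == 0 and steps == modulus

if __name__ == "__main__":
    q = 2**6 * 3**4
    step = make_step(q, lambda z: z**3 + 7 * z * z + 2)
    print(q_hat(q), is_transitive(step, q))
```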

Proposition 6.15 ([5]). Let $p$ be a prime, and let $g(x)$ be an arbitrary
composition of arithmetic operations and mappings listed in (4.2.1). Then
the mapping $z \mapsto 1 + z + p^2 g(z)$ ($z \in \mathbb{Z}_p$) is ergodic.

In fact, both propositions 6.12, 6.15 and theorem 6.13 are special cases of
the following general

Theorem 6.16 ([ ]). Let $\mathcal{B}_p$ be the class of all functions defined by series
of the form $f(x) = \sum_{i=0}^{\infty} c_i \cdot x^{\underline{i}}$, where $c_0, c_1, \dots$ are $p$-adic integers, and $x^{\underline{i}}$
($i = 0, 1, 2, \dots$) are descending factorial powers (see 6.12). Then the function
$f \in \mathcal{B}_p$ preserves measure iff it is bijective modulo $p^2$; $f$ is ergodic iff it is
transitive modulo $p^2$ (for $p \ne 2, 3$), or modulo $p^3$ (for $p \in \{2, 3\}$).

Note. As was shown in [5], the class $\mathcal{B}_p$ contains all polynomial functions
over $\mathbb{Z}_p$, as well as analytic (e.g., rational, entire) functions that are
convergent everywhere on $\mathbb{Z}_p$. (More information about this class can be found in [ ].)
As a matter of fact, every mapping that is a composition of arithmetic
operators (addition, subtraction, multiplication, and
operators listed in (4.2.1)) belongs to $\mathcal{B}_p$; thus, every such mapping modulo
$p^n$ could be induced by a polynomial with rational integer coefficients (see
the end of Section 4 in [ ]). For instance, the mapping $x \mapsto (3x + 3^x) \bmod 2^n$
(which is transitive modulo $2^n$, see 6.11) could be induced by the polynomial

$$1 + x + 4\binom{x}{1} + \sum_{i=2}^{n-1} 2^i \binom{x}{i} = 1 + 5x + \sum_{i=2}^{n-1} \frac{2^i}{i!}\, x^{\underline{i}};$$

just note that $c_i = \frac{2^i}{i!}$ are
2-adic integers since the exponent of the maximal power of 2 that is a factor of $i!$
is exactly $i - \mathrm{wt}_2\, i$, where $\mathrm{wt}_2\, i$ is the number of 1's in the base-2 expansion of
$i$ (see e.g. [17, Chapter 1, Section 2, Exercise 12]); thus $\|c_i\|_2 = 2^{-\mathrm{wt}_2 i} < 1$,
i.e. $c_i \in \mathbb{Z}_2$ and so $c_i \bmod 2^n \in \mathbb{Z}$.
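
Both facts used in this Note can be checked numerically. The Python sketch below verifies that the 2-adic valuation of $i!$ equals $i - \mathrm{wt}_2\, i$ and that $3x + 3^x$ agrees modulo $2^n$ with the truncated binomial expansion; the function names are illustrative, and the truncation at $i = n - 1$ is the one that suffices modulo $2^n$ because every later term is divisible by $2^n$.

```python
from math import comb, factorial

def two_adic_valuation(k):
    """Largest e with 2^e dividing the positive integer k."""
    e = 0
    while k % 2 == 0:
        k //= 2
        e += 1
    return e

def binary_weight(i):
    """Number of 1's in the base-2 expansion of i."""
    return bin(i).count("1")

def truncated_polynomial(x, n):
    """1 + 5x + sum_{i=2}^{n-1} 2^i C(x, i), reduced modulo 2^n."""
    total = 1 + 5 * x + sum((1 << i) * comb(x, i) for i in range(2, n))
    return total % (1 << n)

if __name__ == "__main__":
    n = 8
    print(all((3 * x + pow(3, x, 1 << n)) % (1 << n) == truncated_polynomial(x, n)
              for x in range(1 << n)))
```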

Theorem 6.16 implies that, for instance, the state transition function
$f(z) = \bigl(1 + z + \zeta(q)^2 (1 + \zeta(q)u(z))^{v(z)}\bigr) \bmod q$ is transitive modulo $q$ for each
natural $q > 1$ and arbitrary polynomials $u(x), v(x) \in \mathbb{Z}[x]$, where $\zeta(q)$ is the
product of all prime factors of $q$. So one can choose as a state transition
function not only polynomial functions, but also rational functions, as well
as analytic ones. It should be mentioned, however, that this is merely a
form in which the function is represented (which could be suitable for some cases
and unsuitable for others), yet, for a given $q$, all the functions of this
type may also be represented as polynomials over $\mathbb{Z}$ (see [5, Proposition 4.4;
resp., Proposition 4.10 in the preprint]). For instance, certain generators
of inversive kind (i.e., those that take the inverse modulo $2^n$) could be
considered in this manner.

Example 6.17. For $f(x) = -\frac{1}{2x+1} - x$ the generator $x_{i+1} = f(x_i) \bmod 2^n$ is
transitive. Indeed, the function $f(x) = (-1 + 2x - 4x^2 + 8x^3 - \cdots) - x =
-1 + x - 4x^2 + 8(\cdots)$ is analytic and defined everywhere on $\mathbb{Z}_2$; thus $f \in \mathcal{B}_p$.
Now the conclusion follows in view of 6.16 since by direct calculations it
could be easily verified that the function $f(x) \equiv -1 + x - 4x^2 \pmod 8$ is
transitive modulo 8. Note that modulo $2^n$ the mapping $x \mapsto f(x) \bmod 2^n$
could be induced by the polynomial $-1 + x - 4x^2 + 8x^3 + \cdots + (-1)^n 2^{n-1}x^{n-1}$.
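
The inversive generator of Example 6.17 is easy to try out in Python, because the built-in `pow` accepts the exponent `-1` together with a modulus (Python 3.8 or later) and $2x + 1$ is always invertible modulo $2^n$. The names below are illustrative; the sketch compares the inversive form with the polynomial form and checks the cycle length.

```python
def inversive_step(n):
    """State update x -> -1/(2x+1) - x mod 2^n."""
    modulus = 1 << n
    def step(x):
        inverse = pow(2 * x + 1, -1, modulus)    # modular inverse, Python 3.8+
        return (-inverse - x) % modulus
    return step

def polynomial_step(n):
    """The same map as -1 + x - 4x^2 + 8x^3 - ... truncated after the x^(n-1) term."""
    modulus = 1 << n
    def step(x):
        geometric = sum((-2 * x) ** k for k in range(n))   # 1/(1+2x) modulo 2^n
        return (-geometric - x) % modulus
    return step

def cycle_length(step, start, limit):
    """Length of the cycle through start, or None if start does not recur within limit steps."""
    x, count = step(start), 1
    while x != start and count <= limit:
        x, count = step(x), count + 1
    return count if x == start else None

if __name__ == "__main__":
    n = 12
    print(cycle_length(inversive_step(n), 0, 1 << n) == 1 << n)
```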

6.4. Explicit expressions. It turns out that there is an easy way to con- 
struct a measure preserving or ergodic mapping out of an arbitrary com- 
patible mapping, i.e., out of an arbitrary composition of both arithmetic 
(including (4.2.1)) and bitwise logical operators. 

Theorem 6.18 ([ ]). Let $\Delta$ be a difference operator, i.e., $\Delta g(x) = g(x + 1) - g(x)$
by the definition. Let, further, $p$ be a prime, let $c$ be coprime
with $p$, $\gcd(c, p) = 1$, and let $g\colon \mathbb{Z}_p \to \mathbb{Z}_p$ be a compatible mapping. Then
the mapping $z \mapsto c + z + p\Delta g(z)$ ($z \in \mathbb{Z}_p$) is ergodic, and the mapping
$z \mapsto d + cz + pg(z)$ preserves measure for arbitrary $d$.

Moreover, if $p = 2$, then the converse also holds: Each compatible and
ergodic (respectively, each compatible and measure preserving) mapping
$z \mapsto f(z)$ ($z \in \mathbb{Z}_2$) could be represented as $f(x) = 1 + x + 2\Delta g(x)$ (respectively,
as $f(x) = d + x + 2g(x)$) for suitable $d \in \mathbb{Z}_2$ and compatible $g\colon \mathbb{Z}_2 \to \mathbb{Z}_2$.

Note. The case $p = 2$ is the only case in which the converse of the first assertion of
theorem 6.18 holds.
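
For $p = 2$ the construction of Theorem 6.18 amounts to a one-line recipe on machine words. The Python sketch below wraps an arbitrary compatible $g$ (any composition of addition, multiplication, AND, OR, XOR and left shifts) into an ergodic and into a measure-preserving map and checks both by brute force. The two sample functions `g1` and `g2` and all helper names are illustrative choices, not prescribed by the theorem.

```python
def ergodic_from(g, n, c=1):
    """x -> c + x + 2*(g(x+1) - g(x)) mod 2^n, with c odd."""
    modulus = 1 << n
    return lambda x: (c + x + 2 * (g(x + 1) - g(x))) % modulus

def preserving_from(g, n, c=1, d=0):
    """x -> d + c*x + 2*g(x) mod 2^n, with c odd and d arbitrary."""
    modulus = 1 << n
    return lambda x: (d + c * x + 2 * g(x)) % modulus

def g1(x):
    return x ^ (2 * x + 1)          # x XOR (2x + 1)

def g2(x):
    return (x * x) | (x + 3)        # another compatible composition

def is_bijective(f, modulus):
    return len({f(x) for x in range(modulus)}) == modulus

def is_transitive(f, modulus):
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps > modulus:
            return x == 0 and steps == modulus

if __name__ == "__main__":
    n = 16
    print(is_transitive(ergodic_from(g2, n, c=5), 1 << n),
          is_bijective(preserving_from(g2, n, c=3, d=7), 1 << n))
```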

Proof. Throughout the proof $v$ stands for the compatible mapping of the
statement, and we write $g(x) = d + cx + pv(x)$ and $h(x) = c + x + p\Delta v(x)$.
To start with, by induction on $l$ we show that $g$ is bijective modulo
$p^l$ for all $l = 1, 2, 3, \dots$ The assumption is obviously true for $l = 1$.
Assume it is true for $l = 1, 2, \dots, k - 1$. Prove that it holds for $l = k$
as well. Let $g(a) \equiv g(b) \pmod{p^k}$ for some $p$-adic integers $a, b$. Then $a \equiv b \pmod{p^{k-1}}$
by the induction hypothesis. Hence $pv(a) \equiv pv(b) \pmod{p^k}$
since $v$ is compatible. Further, the congruence $g(a) \equiv g(b) \pmod{p^k}$ implies
that $ca + pv(a) \equiv cb + pv(b) \pmod{p^k}$, and consequently, $ca \equiv cb \pmod{p^k}$.
Since $c \not\equiv 0 \pmod p$, the latter congruence implies that $a \equiv b \pmod{p^k}$,
which proves that $g$ preserves measure.

To prove the remaining part of the first assertion we note that the claim just
proven implies that $h$ preserves measure. To prove the transitivity of $h$ modulo
$p^k$ for all $k = 1, 2, 3, \dots$ we apply induction on $k$ once again.

It is obvious that $h$ is transitive modulo $p$. Assume that $h$ is transitive
modulo $p^{k-1}$. Then, since $h$ induces a permutation on the residue ring
$\mathbb{Z}/p^k\mathbb{Z}$ and since $h$ is a compatible function, we conclude that the length of
each cycle of this permutation must be a multiple of $p^{k-1}$. Thus, to prove
that this permutation is a single cycle it suffices to prove that the function

$$h^{p^{k-1}}(x) = \underbrace{h(h(\dots h(x))\dots)}_{p^{k-1}}$$

induces a single cycle permutation on the ideal $p^{k-1}\mathbb{Z}$, generated by the
element $p^{k-1}$ of the ring $\mathbb{Z}/p^k\mathbb{Z}$. In other words, it is sufficient to demonstrate
that the function $\frac{1}{p^{k-1}} h^{p^{k-1}}(p^{k-1}x)$ is transitive modulo $p$.

Applying obvious direct calculations, we successively obtain that

$$h^1(x) = c + x + pv(x + 1) - pv(x),$$
$$h^j(x) = h(h^{j-1}(x)) = c + h^{j-1}(x) + pv(h^{j-1}(x) + 1) - pv(h^{j-1}(x)) = cj + x + p\sum_{i=0}^{j-1} v(h^i(x) + 1) - p\sum_{i=0}^{j-1} v(h^i(x)),$$

and so on. We recall that $h^0(x) = x$ by the definition. So,

$$h^{p^{k-1}}(x) = cp^{k-1} + x + p\sum_{i=0}^{p^{k-1}-1} v(h^i(x) + 1) - p\sum_{i=0}^{p^{k-1}-1} v(h^i(x)). \qquad (6.18.1)$$

Since $h$ is transitive modulo $p^{k-1}$ and compatible, we now get that

$$\sum_{i=0}^{p^{k-1}-1} v(h^i(x) + 1) \equiv \sum_{i=0}^{p^{k-1}-1} v(h^i(x)) \equiv \sum_{z=0}^{p^{k-1}-1} v(z) \pmod{p^{k-1}},$$

and (6.18.1) then implies $h^{p^{k-1}}(x) \equiv cp^{k-1} + x \pmod{p^k}$. But $c \not\equiv 0 \pmod p$,
so we conclude that the function $cp^{k-1} + x$ induces on the ideal
$p^{k-1}\mathbb{Z}$ a single cycle permutation, thus proving the first assertion of the
theorem.

To prove the second assertion, note that as the mapping $g$ of the statement is
compatible, its Mahler interpolation series is of the form of Theorem 4.3; now
note that $\Delta\binom{x}{i} = \binom{x}{i-1}$
and apply Theorem 6.9. □
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Example 6.19. Theorem 6.18 immediately implies Theorem 2 of [15]: For any 
composition $f$ of primitive functions, the mapping $x \mapsto x + 2f(x) \pmod{2^n}$
is invertible: just note that a composition of primitive functions is
compatible (see [15] for the definition of primitive functions). □

Theorem 6.18 is perhaps one of the most important tools in the design of
pseudorandom generators such that both their state transition functions and
output functions are key-dependent. The corresponding schemes are rather
flexible: In fact, one may use a nearly arbitrary composition of arithmetic and
logical operators to produce a strictly uniformly distributed sequence: Both
for $g(x) = x \;\mathrm{XOR}\; (2x + 1)$ and for



a sequence $\{x_i\}$ defined by the recurrence relation $x_{i+1} = (1 + x_i + 2(g(x_i + 1) - g(x_i))) \bmod 2^n$
is strictly uniformly distributed in $\mathbb{Z}/2^n\mathbb{Z}$ for each $n = 1, 2, 3, \dots$,
i.e., the sequence $\{x_i\}$ is purely periodic with period length exactly
$2^n$, and each element of $\{0, 1, \dots, 2^n - 1\}$ occurs in the period exactly once.
We will demonstrate further that a designer could vary the function $g$ in
a very wide scope without worsening prescribed values of some important
indicators of security. In fact, in choosing the proper arithmetic and bitwise
logical operators the designer is restricted only by the desired performance,
since any compatible ergodic mapping could be produced in this way:

Corollary 6.20. Let $p = 2$, and let $f$ be a compatible and ergodic mapping of
$\mathbb{Z}_2$ onto itself. Then for each $n = 1, 2, \dots$ the state transition function $f \bmod 2^n$
could be represented as a finite composition of arithmetic and bitwise
logical operators.

Proof. In view of theorem 6.18 it is sufficient to prove that for an arbitrary
compatible $g$ the function $\bar{g} = g \bmod 2^n$ could be represented as a finite
composition of operators mentioned in the statement. In view of Definition
2.1, one could represent $\bar{g}$ as

$$\bar{g}(x) = \gamma_0(\chi_0) + 2\gamma_1(\chi_0, \chi_1) + \cdots + 2^{n-1}\gamma_{n-1}(\chi_0, \dots, \chi_{n-1}),$$

where $\gamma_i = \delta_i(\bar{g})$, $\chi_i = \delta_i(x)$, $i = 0, 1, \dots, n - 1$. Since each $\gamma_i(\chi_0, \dots, \chi_i)$
is a Boolean function in Boolean variables $\chi_0, \dots, \chi_i$, it could be expressed
via a finite number of XORs and ANDs of these variables $\chi_0, \dots, \chi_i$. Yet each
variable $\chi_j$ could be expressed as $\chi_j = \delta_j(x) = x \;\mathrm{AND}\; (2^j)$, and the conclusion
follows. □

6.5. Using Boolean representations. As we have just seen, in case $p = 2$
we have two equivalent descriptions of the class of all compatible ergodic
mappings, namely, theorems 6.9 and 6.18. They enable one to express any
compatible and transitive modulo $2^n$ state transition function either as a
polynomial of special kind over the field $\mathbb{Q}$ of rational numbers, or as a
special composition of arithmetic and bitwise logical operations. Both these
representations are suitable for programming, since they involve only
standard machine instructions. However, we need one more representation, in
a Boolean form, which we have already used in the definition of T-function
(see Definition 2.1). Although this representation is not very convenient for
programming, it could be used to prove the ergodicity of some simple
mappings, see e.g. 6.22 below. The following theorem is just a restatement in our
terms of a known (at least 30 years old) result from the theory of Boolean
functions, the so-called bijectivity/transitivity criterion for triangle Boolean
mappings. However, the latter result is mathematical folklore, and thus it
is somewhat difficult to attribute it.

Recall that the algebraic normal form, ANF, of the Boolean function
$\psi_j(\chi_0, \dots, \chi_j)$ is the representation of this function via $\oplus$ (addition modulo
2, that is, logical 'exclusive or') and $\cdot$ (multiplication modulo 2, that is,
logical 'and', or conjunction). In other words, the ANF of the Boolean
function $\psi$ is its representation in the form

$$\psi(\chi_0, \dots, \chi_j) = \beta \oplus \beta_0\chi_0 \oplus \beta_1\chi_1 \oplus \cdots \oplus \beta_{0,1}\chi_0\chi_1 \oplus \cdots,$$

where $\beta, \beta_0, \dots \in \{0, 1\}$. The ANF is sometimes called a Boolean polynomial.

Recall that the weight of the Boolean function $\psi_j$ in $(j + 1)$ variables is the
number of $(j + 1)$-bit words that satisfy $\psi_j$; that is, the weight is the cardinality
of the truth set of $\psi_j$.

Theorem 6.21. A mapping $T\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ is compatible and measure
preserving iff for each $i = 0, 1, \dots$ the ANF of the Boolean function $\tau_i^T = \delta_i(T)$
in Boolean variables $\chi_0, \dots, \chi_i$ is

$$\tau_i^T(\chi_0, \dots, \chi_i) = \chi_i \oplus \varphi_i^T(\chi_0, \dots, \chi_{i-1}),$$

where $\varphi_i^T$ is an ANF. The mapping $T$ is compatible and ergodic iff,
additionally, the Boolean function $\varphi_i^T$ is of odd weight, that is, takes value
1 at exactly an odd number of points $(\varepsilon_0, \dots, \varepsilon_{i-1})$, where $\varepsilon_j \in \{0, 1\}$ for
$j = 0, 1, \dots, i - 1$. The latter takes place if and only if $\varphi_0^T = 1$, and the
degree of the ANF $\varphi_i^T$ for $i \ge 1$ is exactly $i$, that is, $\varphi_i^T$ contains the monomial
$\chi_0 \cdots \chi_{i-1}$.
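
The criterion of Theorem 6.21 can be turned into a direct test on truth tables. The Python sketch below extracts the coordinate function $\delta_i(T)$ of a T-function given as ordinary integer code, checks the form $\chi_i \oplus \varphi_i$, and reads off the coefficient of the monomial $\chi_0 \cdots \chi_{i-1}$ from the ANF computed by the standard binary Möbius transform. All names are illustrative, and the sample maps $x + (x^2 \vee C)$ are borrowed from Example 6.22 only as test data.

```python
def coordinate_table(T, i):
    """Truth table of delta_i(T(x)) in chi_0..chi_i, indexed by x < 2^(i+1)."""
    return [(T(x) >> i) & 1 for x in range(1 << (i + 1))]

def anf(table):
    """Binary Moebius transform; coeffs[m] multiplies the product of chi_j over set bits j of m."""
    coeffs = list(table)
    step = 1
    while step < len(coeffs):
        for start in range(0, len(coeffs), 2 * step):
            for k in range(start, start + step):
                coeffs[k + step] ^= coeffs[k]
        step *= 2
    return coeffs

def check_621(T, n):
    """(measure preserving, ergodic) modulo 2^n according to Theorem 6.21."""
    ergodic = True
    for i in range(n):
        table = coordinate_table(T, i)
        half = 1 << i
        # delta_i = chi_i XOR phi_i holds iff flipping chi_i always flips the output bit.
        if any(table[x] == table[x + half] for x in range(half)):
            return False, False
        phi = table[:half]            # phi_i is delta_i restricted to chi_i = 0
        top = anf(phi)[-1]            # coefficient of chi_0 ... chi_{i-1}
        assert top == sum(phi) % 2    # full degree is the same as odd weight
        ergodic = ergodic and top == 1
    return True, ergodic

def is_transitive(f, modulus):
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps > modulus:
            return x == 0 and steps == modulus

if __name__ == "__main__":
    for C in (1, 4, 5):
        T = lambda x, C=C: x + (x * x | C)
        print(C, check_621(T, 8), is_transitive(T, 1 << 8))
```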

Proof. Represent the value of the function $T$ at the 2-adic integer point
$x = \chi_0 + \chi_1 \cdot 2 + \chi_2 \cdot 2^2 + \cdots$ as a 2-adic integer:

$$T(\chi_0 + \chi_1 \cdot 2 + \chi_2 \cdot 2^2 + \cdots) = \sum_{i=0}^{\infty} \delta_i(T(x)) \cdot 2^i.$$

The function $T$ is compatible (that is, a T-function) if and only if $\delta_i(T(x))$
does not depend on $\chi_{i+1}, \chi_{i+2}, \dots$ for every $i = 0, 1, 2, \dots$, see Definition 2.1.
Thus, each $\delta_i(T(x))$ is a Boolean function $\tau_i^T$ in Boolean variables $\chi_0, \chi_1, \dots, \chi_i$.
Re-write the ANF of the function $\tau_i^T$ in the following form:

$$\tau_i^T(\chi_0, \dots, \chi_i) = \chi_i \cdot \psi_i^T(\chi_0, \dots, \chi_{i-1}) \oplus \varphi_i^T(\chi_0, \dots, \chi_{i-1}),$$

where both $\psi_i^T(\chi_0, \dots, \chi_{i-1})$ and $\varphi_i^T(\chi_0, \dots, \chi_{i-1})$ are Boolean functions in
Boolean variables $\chi_0, \dots, \chi_{i-1}$.

Obviously, whenever all $\psi_i^T(\chi_0, \dots, \chi_{i-1})$ are identically 1, the function $T$ is
measure-preserving since it is bijective modulo $2^{k+1}$ for each $k = 0, 1, 2, \dots$:
To find a pre-image under the mapping $T \bmod 2^{k+1}$ one must solve the system of
Boolean equations

$$\chi_k \oplus \varphi_k^T(\chi_0, \dots, \chi_{k-1}) = \alpha_k,$$
$$\cdots$$
$$\chi_0 \oplus \varphi_0^T = \alpha_0,$$

which has a unique solution given any $\alpha_0, \dots, \alpha_k \in \{0, 1\}$.

Conversely, let $i$ be the smallest number such that $\psi_i^T(\chi_0, \dots, \chi_{i-1}) = 0$
for a certain set $\chi_0, \dots, \chi_{i-1}$ of zeros and ones. Then

$$T(\chi_0 + \chi_1 \cdot 2 + \cdots + \chi_{i-1} \cdot 2^{i-1} + 0 \cdot 2^i) \equiv T(\chi_0 + \chi_1 \cdot 2 + \cdots + \chi_{i-1} \cdot 2^{i-1} + 1 \cdot 2^i) \pmod{2^{i+1}}.$$

Thus, $T$ cannot be measure-preserving in view of Theorem 5.2.

Further, to prove the ergodicity part of the statement we note that $T$ is
transitive modulo 2 if and only if $\tau_0^T(\chi_0) = \chi_0 \oplus 1$. In case $T$ is transitive
modulo $2^k$,

$$\delta_i(T^{2^k}(x)) = \begin{cases} \chi_i, & \text{if } i < k; \\ \chi_k \oplus \sigma, & \text{if } i = k, \end{cases}$$

where $\sigma$ is the sum modulo 2 of all values of the Boolean function $\varphi_k^T$ at all
points of $\mathbb{B}^k$; that is, $\sigma$ is the weight modulo 2 of the function $\varphi_k^T$. Clearly, to
provide transitivity of the function $T$ modulo $2^{k+1}$ (cf. Theorem 5.2) there must
be $\sigma = 1$. That is, the weight of the function $\varphi_k^T$ must be odd.

The rest of the statement of the theorem is a well-known result in the
theory of Boolean functions; the proof is left to the reader. □

Note. The bit-slice technique of Klimov and Shamir, which they introduced
in 2002 in [15], is just a restatement of the folklore theorem 6.21
stated above.

This is how Theorem 6.21 works: 

Example 6.22. With the use of 6.21 it is possible to give another proof of the
main result of [15], namely, of Theorem 3: The mapping $f(x) = x + (x^2 \vee C)$
over $n$-bit words is invertible if and only if the least significant bit of $C$ is 1.
For $n \ge 3$ it is a permutation with a single cycle if and only if both the least
significant bit and the third least significant bit of $C$ are 1.

Proof of theorem 3 of [15]. Recall that for $x \in \mathbb{Z}_2$ and $i = 0, 1, 2, \dots$ we
denote $\chi_i = \delta_i(x) \in \{0, 1\}$; also we denote $c_i = \delta_i(C)$. We will calculate
$\delta_i(x + (x^2 \vee C))$ as an ANF in Boolean variables $\chi_0, \chi_1, \dots$ and we start with
the following easy claims:

• $\delta_0(x^2) = \chi_0$, $\delta_1(x^2) = 0$, $\delta_2(x^2) = \chi_0\chi_1 \oplus \chi_1$,

• $\delta_n(x^2) = \chi_{n-1}\chi_0 \oplus \psi_n(\chi_0, \dots, \chi_{n-2})$ for all $n \ge 3$, where $\psi_n$ is a
Boolean function in $n - 1$ Boolean variables $\chi_0, \dots, \chi_{n-2}$.

The first of these claims could be easily verified by direct calculations.
To prove the second one represent $x = \bar{x}_{n-1} + 2^{n-1}s_{n-1}$ (where we recall
$\bar{x}_{n-1} = x \bmod 2^{n-1}$) and calculate $x^2 = (\bar{x}_{n-1} + 2^{n-1}s_{n-1})^2 = \bar{x}_{n-1}^2 +
2^n s_{n-1}\bar{x}_{n-1} + 2^{2n-2}s_{n-1}^2 \equiv \bar{x}_{n-1}^2 + 2^n\chi_{n-1}\chi_0 \pmod{2^{n+1}}$ for $n \ge 3$ and note
that $\bar{x}_{n-1}^2$ depends only on $\chi_0, \dots, \chi_{n-2}$.

This gives

(1) $\delta_0(x^2 \vee C) = \chi_0 \oplus c_0 \oplus \chi_0c_0$

(2) $\delta_1(x^2 \vee C) = c_1$

(3) $\delta_2(x^2 \vee C) = \chi_0\chi_1 \oplus \chi_1 \oplus c_2 \oplus c_2\chi_1 \oplus c_2\chi_0\chi_1$

(4) $\delta_n(x^2 \vee C) = \chi_{n-1}\chi_0 \oplus \psi_n \oplus c_n \oplus c_n\chi_{n-1}\chi_0 \oplus c_n\psi_n$ for $n \ge 3$

From here it follows that if $n \ge 3$, then $\delta_n(x^2 \vee C) = \lambda_n(\chi_0, \dots, \chi_{n-1})$, and
$\deg \lambda_n \le n - 1$, since $\psi_n$ depends at most on $\chi_0, \dots, \chi_{n-2}$.

Now successively calculate $\gamma_n = \delta_n(x + (x^2 \vee C))$ for $n = 0, 1, 2, \dots$ We
have $\delta_0(x + (x^2 \vee C)) = c_0 \oplus \chi_0c_0$, so necessarily $c_0 = 1$ since otherwise
$f$ is not bijective modulo 2. Proceeding further with $c_0 = 1$ we obtain
$\delta_1(x + (x^2 \vee C)) = c_1 \oplus \chi_0 \oplus \chi_1$, since $\chi_0$ is a carry. Then $\delta_2(x + (x^2 \vee C)) =
(c_1\chi_0 \oplus c_1\chi_1 \oplus \chi_0\chi_1) \oplus (\chi_0\chi_1 \oplus \chi_1 \oplus c_2 \oplus c_2\chi_1 \oplus c_2\chi_0\chi_1) \oplus \chi_2 =
c_1\chi_0 \oplus c_1\chi_1 \oplus \chi_1 \oplus c_2 \oplus c_2\chi_1 \oplus c_2\chi_0\chi_1 \oplus \chi_2$; here $c_1\chi_0 \oplus c_1\chi_1 \oplus \chi_0\chi_1$ is a
carry. From here in view of 6.21 we immediately have $c_2 = 1$ since otherwise
$f$ is not transitive modulo 8. Now for $n \ge 3$ one has $\gamma_n = \alpha_n \oplus \lambda_n \oplus \chi_n$,
where $\alpha_n$ is a carry, and $\alpha_{n+1} = \alpha_n\lambda_n \oplus \alpha_n\chi_n \oplus \lambda_n\chi_n$. But if $c_2 = 1$
then $\deg \alpha_3 = \deg(\mu\nu \oplus \chi_2\mu \oplus \chi_2\nu) = 3$, where $\mu = c_1\chi_0 \oplus c_1\chi_1 \oplus \chi_0\chi_1$,
$\nu = (\chi_0\chi_1 \oplus \chi_1 \oplus c_2 \oplus c_2\chi_1 \oplus c_2\chi_0\chi_1) = 1$. This implies inductively in view of
(4) above that $\deg \alpha_{n+1} = n + 1$ and that $\gamma_{n+1} = \chi_{n+1} \oplus \xi_{n+1}(\chi_0, \dots, \chi_n)$,
$\deg \xi_{n+1} = n + 1$. So the conditions of 6.21 are satisfied, thus finishing the
proof of theorem 3 of [15]. □
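
The statement just proved, together with the two bit-level claims about $x^2$ that open the proof, can be confirmed by exhaustive search for a short word length. The Python sketch below does this; the function names are illustrative and the word length $n = 6$ is an arbitrary small choice.

```python
def klimov_shamir(C, n):
    """x -> x + (x^2 OR C) mod 2^n."""
    mask = (1 << n) - 1
    return lambda x: (x + (x * x | C)) & mask

def is_invertible(f, n):
    return len({f(x) for x in range(1 << n)}) == 1 << n

def is_single_cycle(f, n):
    x, steps = f(0), 1
    while x != 0 and steps <= (1 << n):
        x, steps = f(x), steps + 1
    return x == 0 and steps == 1 << n

def bit(value, i):
    return (value >> i) & 1

if __name__ == "__main__":
    n = 6
    for C in range(8):
        f = klimov_shamir(C, n)
        print(C, is_invertible(f, n), is_single_cycle(f, n))
```

In the printed table the invertible maps are exactly those with odd $C$, and the single-cycle ones are those with $C \equiv 5$ or $7 \pmod 8$.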

There are some more applications of Theorem 6.21. 

Proposition 6.23. Let $F\colon \mathbb{Z}_2^{n+1} \to \mathbb{Z}_2$ be a compatible mapping such that
for all $z_1, \dots, z_n \in \mathbb{Z}_2$ the mapping $F(x, z_1, \dots, z_n)\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ is measure
preserving. Then $F(f(x), 2g_1(x), \dots, 2g_n(x))$ preserves measure for all
compatible $g_1, \dots, g_n\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ and all compatible and measure preserving
$f\colon \mathbb{Z}_2 \to \mathbb{Z}_2$. Moreover, if $f$ is ergodic then $f(x + 4g(x))$, $f(x \oplus (4g(x)))$,
$f(x) + 4g(x)$, and $f(x) \oplus (4g(x))$ are ergodic for any compatible $g\colon \mathbb{Z}_2 \to \mathbb{Z}_2$
(here $\oplus$ stands for xor).

Proof. Try to prove this yourself! □ 

Example 6.24. With the use of 6.23 it is possible to construct very fast
generators $x_{i+1} = f(x_i) \bmod 2^n$ that are transitive modulo $2^n$. For instance,
take

$$f(x) = (\dots((((x + c_0) \oplus d_0) + c_1) \oplus d_1) + \cdots + c_m) \oplus d_m,$$

where $c_0 \equiv 1 \pmod 2$, and the rest of $c_i, d_i$ are 0 modulo 4. By the way,
this generator, looking somewhat 'linear', is as a rule rather 'nonlinear': the
corresponding polynomial over $\mathbb{Q}$ is of high degree. The general case of these
functions $f$ (for arbitrary $c_i, d_i$) was studied by the author's student Ludmila
Kotomina: She proved that such a function is ergodic iff it is transitive
modulo 4.
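
A generator of this add/xor type takes only a few machine instructions per step. The Python sketch below builds one from lists of constants and checks the full period for constants chosen as in the example ($c_0$ odd, every other constant a multiple of 4); the names and the random constants are illustrative.

```python
import random

def add_xor_chain(cs, ds, n):
    """x -> (...(((x + c_0) XOR d_0) + c_1) XOR d_1 ...) mod 2^n."""
    mask = (1 << n) - 1
    def f(x):
        for c, d in zip(cs, ds):
            x = ((x + c) ^ d) & mask
        return x
    return f

def example_constants(m, n, rng):
    """c_0 odd; d_0 and all later c_i, d_i divisible by 4."""
    cs = [rng.randrange(1 << n) | 1] + [4 * rng.randrange(1 << (n - 2)) for _ in range(m)]
    ds = [4 * rng.randrange(1 << (n - 2)) for _ in range(m + 1)]
    return cs, ds

def is_transitive(f, modulus):
    x, steps = 0, 0
    while True:
        x = f(x) % modulus
        steps += 1
        if x == 0 or steps > modulus:
            return x == 0 and steps == modulus

if __name__ == "__main__":
    rng = random.Random(24)
    cs, ds = example_constants(m=4, n=16, rng=rng)
    print(is_transitive(add_xor_chain(cs, ds, 16), 1 << 16))
```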

Yet another application of Theorem 6.21 is the construction of multivariate
single cycle T-functions. We already know that there are no such functions among
uniformly differentiable modulo 2 functions, see Theorem 6.3. However,
non-differentiable modulo 2 multivariate ergodic functions on $\mathbb{Z}_2$ do exist.

In 2004 Klimov and Shamir introduced a multivariate T-function $H$ with
a single cycle property. The $m$-variate mapping

$$H\colon (\mathbf{x}_0, \mathbf{x}_1, \dots, \mathbf{x}_{m-1}) \mapsto (h_0, h_1, \dots, h_{m-1})$$

over $n$-bit words $\mathbf{x}_0, \mathbf{x}_1, \dots, \mathbf{x}_{m-1}$, defined by

$$h_s = \mathbf{x}_s \oplus \Bigl(\bigl(h(\mathbf{x}_0 \wedge \cdots \wedge \mathbf{x}_{m-1}) \oplus (\mathbf{x}_0 \wedge \cdots \wedge \mathbf{x}_{m-1})\bigr) \wedge \mathbf{x}_0 \wedge \cdots \wedge \mathbf{x}_{s-1}\Bigr),$$

$s = 0, 1, \dots, m - 1$, has a single cycle property whenever $h$ is a univariate
T-function with a single cycle property. Here $\wedge$ stands for AND, the bitwise logical
'and' (a conjunction). We assume that a bitwise conjunction over an empty
set of indices is a string of all 1's.

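Actually, this is just a trick: The $m$-variate mapping $H$ on $n$-bit words is
a multivariate representation of a univariate T-function over $mn$-bit words.
Indeed, given a univariate T-function $F$,

$$x = (\dots, \chi_2, \chi_1, \chi_0) \stackrel{F}{\mapsto} (\dots; \psi_2(\chi_0, \chi_1, \chi_2); \psi_1(\chi_0, \chi_1); \psi_0(\chi_0)),$$

arrange this mapping in columns of height $m$, this way:

$$\begin{array}{llllcllll}
\cdots & \chi_{2m} & \chi_m & \chi_0 & \stackrel{f_0}{\mapsto} & \cdots & \psi_{2m}(x) & \psi_m(x) & \psi_0(x) \\
\cdots & \chi_{2m+1} & \chi_{m+1} & \chi_1 & \stackrel{f_1}{\mapsto} & \cdots & \psi_{2m+1}(x) & \psi_{m+1}(x) & \psi_1(x) \\
 & & \cdots & & & & & \cdots & \\
\cdots & \chi_{3m-1} & \chi_{2m-1} & \chi_{m-1} & \stackrel{f_{m-1}}{\mapsto} & \cdots & \psi_{3m-1}(x) & \psi_{2m-1}(x) & \psi_{m-1}(x)
\end{array}$$

Now just assume the left-hand rows are new variables:

$$\mathbf{x}_j = (\dots, \chi_{2m+j}, \chi_{m+j}, \chi_j), \qquad (j = 0, 1, \dots, m - 1).$$

Obviously, the $m$-variate mapping $\mathbf{F} = (f_0, f_1, \dots, f_{m-1})$ has a single cycle
property iff the univariate mapping $F$ has a single cycle property.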
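Consider the simplest example: $F(x) = 1 + x$. We have

$$\delta_j(F(x)) \equiv \delta_j(x) + \prod_{s=0}^{j-1} \delta_s(x) \pmod 2$$

(we assume the product over the empty set is 1); then the $m$-variate
representation $\mathbf{F} = (f_0, f_1, \dots, f_{m-1})$ of this mapping is

$$f_k(\mathbf{x}_0, \dots, \mathbf{x}_{m-1}) = \mathbf{x}_k \oplus \Bigl(\Bigl(\bigwedge_{s=0}^{k-1} \mathbf{x}_s\Bigr) \wedge \Bigl(\Bigl(\Bigl(\bigwedge_{r=0}^{m-1} \mathbf{x}_r\Bigr) + 1\Bigr) \oplus \Bigl(\bigwedge_{r=0}^{m-1} \mathbf{x}_r\Bigr)\Bigr)\Bigr).$$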
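
The 'trick' can be made concrete in Python. The sketch below implements the $m$-word mapping in the form $\mathbf{x}_s \oplus ((h(a) \oplus a) \wedge \mathbf{x}_0 \wedge \cdots \wedge \mathbf{x}_{s-1})$ with $a = \mathbf{x}_0 \wedge \cdots \wedge \mathbf{x}_{m-1}$, which is the reading of the Klimov–Shamir construction assumed here, and checks two things: with $h(x) = x + 1$ it is exactly the increment of the interleaved $mn$-bit integer, and with a single-cycle $h$ it has one cycle of length $2^{mn}$. All function names are illustrative.

```python
from itertools import product

def ks_map(words, h, n):
    """One step of the m-word mapping on n-bit words built from a univariate T-function h."""
    mask = (1 << n) - 1
    conj_all = mask
    for w in words:
        conj_all &= w
    alpha = (h(conj_all) ^ conj_all) & mask
    out, prefix = [], mask               # empty conjunction = all ones
    for w in words:
        out.append(w ^ (alpha & prefix))
        prefix &= w                      # conjunction of the original words x_0..x_s
    return tuple(out)

def interleave(words, n):
    """Univariate view: bit i of word j becomes bit i*m + j of one mn-bit integer."""
    m, value = len(words), 0
    for i in range(n):
        for j, w in enumerate(words):
            value |= ((w >> i) & 1) << (i * m + j)
    return value

def cycle_length(step, start, limit):
    state, count = step(start), 1
    while state != start and count <= limit:
        state, count = step(state), count + 1
    return count if state == start else None

if __name__ == "__main__":
    m, n = 2, 4
    h = lambda x: x + (x * x | 5)
    print(cycle_length(lambda s: ks_map(s, h, n), (0,) * m, 1 << (m * n)))
```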



With the use of this trick and with Theorem 6.21 the following multivari- 
ate ergodic T-functions could be constructed: 



Proposition 6.25 ([7]). Let t, j G {0, 1, . . . , m - 1}, let all ff> (reap., gf) 
be univariate ergodic (resp, measure-preserving) compatible mappings from 
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%2 onto Z2. Then the mapping F(x) = (/o(x), . . . , / m _i(x)) 

(m— 1 
r=0 

/i(x) = ^1 ffl ( ^(^o) a 1 /\ (/; r) (i? r ) © 



(/ m—2 \ / m—1 

( A Sm-lO?*)) A ( /\ (tiU^r) © ^r) 
^ t=0 ' ^ r=0 

where $\mathbf{x} = (\mathbf{x}_0, \dots, \mathbf{x}_{m-1})$ and $\boxplus \in \{+, \oplus\}$, is a compatible and ergodic
mapping of $\mathbb{Z}_2^m$ onto $\mathbb{Z}_2^m$.

7. Wreath products of PRNGs 

In the preceding section we have developed some tools that enable us 
to construct algorithms based on standard instructions of an n-bit word 
processor that produce strictly uniformly distributed sequences of period 
length 2 n . 

To judge whether these sequences could be of use for stream encryption
we must study their properties that are crucial for stream ciphers. One
of these properties is a long period. But is the period of our sequences long
enough? Not yet! In case $n = 32$, which is standard for most contemporary
processors, we obtain a period of length $2^{32}$, which is too small to satisfy
contemporary safety conditions: At least some $2^{80}$ is needed. Thus, we must
make the period longer while leaving the sequence uniformly distributed. In this
section we consider the corresponding techniques.

7.1. What is wreath product. We start with a formal definition: 

Definition 7.1. Given a mapping $U\colon Z \to Z$, and a set of mappings $V =
\{(V_z\colon X \to X) : z \in Z\}$, a wreath product (or, a skew product or, a skew
shift) is a mapping

$$U \wr V\colon (z, x) \mapsto (U(z), V_z(x))$$

of the Cartesian product $Z \times X$ into itself.

In other words, the wreath product is a bivariate mapping where the first
coordinate is a function of the variable $z$ only, and the second coordinate is
a bivariate function of $z$ and $x$.

Most probably, you are already familiar with examples of wreath products;
recall the Feistel network: The mapping it is based on is $(z, x) \mapsto (z, z \oplus f(x))$,
where $z, x \in \mathbb{B}^n$, $f\colon \mathbb{B}^n \to \mathbb{B}^n$, which is obviously a wreath product of
$U(z) = z$ with $V = \{V_z(x) = z \oplus f(x) : z \in \mathbb{B}^n\}$.

Obviously, the wreath product $U \wr V$ is bijective whenever
both $U$ and all $V_z$ are bijective.
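
Definition 7.1 translates almost literally into code. The Python sketch below forms the skew product of a map `U` with a family `V` given as a two-argument function, and checks the bijectivity remark on small finite sets. The concrete choices of `U`, `V` and `f` are illustrative; note that with $V_z(x) = z \oplus f(x)$ every $V_z$ is a bijection precisely when $f$ is.

```python
from itertools import product

def wreath(U, V):
    """Skew product (z, x) -> (U(z), V(z, x)); V(z, .) plays the role of V_z."""
    return lambda state: (U(state[0]), V(state[0], state[1]))

def is_bijective_on(mapping, states):
    states = list(states)
    return len({mapping(s) for s in states}) == len(states)

N = 16                                             # both Z and X are range(16) here

def counter(z):
    return (z + 1) % N                             # a bijective U

def shifted_add(z, x):
    return (x + (z * z | 1)) % N                   # bijective in x for every z

def feistel_like(f):
    """(z, x) -> (z, z XOR f(x)), the form used in the text."""
    return wreath(lambda z: z, lambda z, x: z ^ f(x))

if __name__ == "__main__":
    states = list(product(range(N), repeat=2))
    print(is_bijective_on(wreath(counter, shifted_add), states),
          is_bijective_on(feistel_like(lambda x: (5 * x + 3) % N), states),
          is_bijective_on(feistel_like(lambda x: (x * x) % N), states))
```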

Some terminology notes: In automata theory (and in algebra) one usually
speaks of wreath products, whereas in dynamical systems (and in ergodic
theory) the term skew product, or skew shift, is preferred. Recall that
an ordinary PRNG corresponds to an autonomous dynamical system.

A skew product, by contrast, is a non-autonomous dynamical system, which is the
counterpart in dynamics of a counter-dependent PRNG 2.0.1: A non-autonomous dynamical
system is a dynamical system driven by another dynamical system, and skew
products are used to combine two dynamical systems into a new one.

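Note that a T-function is a composition of wreath products: Let $F$ be a
T-function,

$$(\chi_0, \chi_1, \chi_2, \dots) \mapsto (\psi_0(\chi_0); \psi_1(\chi_0, \chi_1); \psi_2(\chi_0, \chi_1, \chi_2); \dots),$$

then

$$\chi_0 \mapsto \psi_0(\chi_0)$$
$$(\chi_0, \chi_1) \mapsto (\psi_0(\chi_0), \psi_1(\chi_0, \chi_1))$$
$$((\chi_0, \chi_1), \chi_2) \mapsto ((\psi_0(\chi_0), \psi_1(\chi_0, \chi_1)), \psi_2(\chi_0, \chi_1, \chi_2))$$
$$\cdots$$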



Now we re-state the above definition for the case of wreath products of 
automata: 

Definition 7.2. Let 2lj = (N,M, fj,Fj) be a family of automata with the 
same state set N and the same output alphabet M indexed by elements 
of a non-empty (possibly, countably infinite) set J (members of the family 
need not be necessarily pairwise distinct). Let T: J — > J be an arbitrary 
mapping. A wreath product of the family {2lj} of automata with respect to 
the mapping T is an automaton with the state set N x J, state transition 
function f(j,z) = (fj(z),T(j)) and output function F(j,z) = Fj(z). We 
call fj (resp., Fj) clock state update (resp., output) functions. 

Obviously, the state transition function $f(j,z) = (f_j(z), T(j))$ is a wreath 
product of the family of mappings $\{f_j : j \in J\}$ with respect to the mapping $T$. 

It is worth noting here that if $J = \mathbb{N}_0$ and $F_i$ does not depend on $i$, this 
construction gives us a number of examples of counter-dependent generators in 
the sense of [23, Definition 2.4], where the notion of a counter-dependent 
generator was originally introduced. However, we use this notion in a broader 
sense than [23]: in our counter-dependent generators 
not only the state transition function but also the output function depends 
on $i$. Moreover, in [23] only a special case of counter-dependent 
generators is studied; namely, counter-assisted generators and their cascaded and 
two-step modifications. A state transition function of a counter-assisted 
generator is of the form $f_i(x) = i \star h(x)$, where $\star$ is a binary quasigroup 
operation (in particular, a group operation, e.g., $+$ or XOR), and $h(x)$ does not 
depend on $i$. An output function of a counter-assisted generator does not 
depend on $i$ either. 

7.2. Constructions. In this subsection we introduce a method to construct 
counter-dependent pseudorandom generators out of ergodic and 
measure-preserving mappings. The method guarantees that output sequences of these 
generators are always strictly uniformly distributed. Actually, all these 
constructions are wreath products of automata in the sense of Definition 7.2; the following 
results give conditions these automata should satisfy to produce a 
uniformly distributed output sequence. Our main technical tool is the following 
theorem, which could be considered a generalization of Theorem 
6.21: 

Theorem 7.3 ([ ]). Let $\mathcal{G} = \{g_0, \ldots, g_{m-1}\}$ be a finite sequence of compatible 
measure preserving mappings of $\mathbb{Z}_2$ onto itself such that 

(1) the sequence $\{(g_{i \bmod m}(0)) \bmod 2 : i = 0, 1, 2, \ldots\}$ is purely periodic, 
its shortest period is of length $m$; 

(2) $\sum_{i=0}^{m-1} g_i(0) \equiv 1 \pmod 2$; 

(3) $\sum_{j=0}^{m-1} \sum_{z=0}^{2^k-1} g_j(z) \equiv 2^k \pmod{2^{k+1}}$ for all $k = 1, 2, \ldots$. 

Then the recurrence sequence $\mathcal{Z}$ defined by the relation $x_{i+1} = g_{i \bmod m}(x_i)$ 
is strictly uniformly distributed modulo $2^n$ for all $n = 1, 2, \ldots$: that is, 
modulo each $2^n$ the sequence $\mathcal{Z}$ is purely periodic, its shortest period is of 
length $2^n m$, and each element of $\mathbb{Z}/2^n\mathbb{Z}$ occurs at the period exactly $m$ times. 

Note. In view of 6.21 condition (3) of theorem 7.3 could be replaced by the 
equivalent condition 

$$\sum_{j=0}^{m-1} \operatorname{Coef}_{0,\ldots,k-1}(\varphi_k^j) \equiv 1 \pmod 2 \qquad (k = 1, 2, \ldots),$$

where $\operatorname{Coef}_{0,\ldots,k-1}(\varphi)$ is the coefficient of the monomial $\chi_0 \cdots \chi_{k-1}$ in the ANF $\varphi$. 

It turns out that the sequence $\mathcal{Z}$ of 7.3 is just the sequence $\mathcal{Y}$ of the 
following 

Lemma 7.4 ([ ]). Let $c_0, \ldots, c_{m-1}$ be a finite sequence of 2-adic integers, 
and let $g_0, \ldots, g_{m-1}$ be a finite sequence of compatible mappings of $\mathbb{Z}_2$ onto 
itself such that 

(i) $g_j(x) \equiv x + c_j \pmod 2$ for $j = 0, 1, \ldots, m-1$, 

(ii) $\sum_{j=0}^{m-1} c_j \equiv 1 \pmod 2$, 

(iii) the sequence $\{c_{i \bmod m} \bmod 2 : i = 0, 1, 2, \ldots\}$ is purely periodic, its 
shortest period is of length $m$, 

(iv) $\delta_k(g_j(z)) \equiv \zeta_k + \varphi_k^j(\zeta_0, \ldots, \zeta_{k-1}) \pmod 2$, $k = 1, 2, \ldots$, where $\zeta_r = 
\delta_r(z)$, $r = 0, 1, 2, \ldots$, 

(v) for each $k = 1, 2, \ldots$ an odd number of ANFs $\varphi_k^j$ in Boolean variables 
$\zeta_0, \ldots, \zeta_{k-1}$ are of odd weight. 

Then the recurrence sequence $\mathcal{Y} = \{x_i \in \mathbb{Z}_2\}$ defined by the relation $x_{i+1} = 
g_{i \bmod m}(x_i)$ is strictly uniformly distributed: it is purely periodic modulo $2^k$ 
for all $k = 1, 2, \ldots$; its shortest period is of length $2^k m$; each element of 
$\mathbb{Z}/2^k\mathbb{Z}$ occurs at the period exactly $m$ times. Moreover, 

(1) the sequence $\mathcal{D}_s = \{\delta_s(x_i) : i = 0, 1, 2, \ldots\}$ is purely periodic; it has 
a period of length $2^{s+1} m$, 

(2) $\delta_s(x_{i+2^s m}) \equiv \delta_s(x_i) + 1 \pmod 2$ for all $s = 0, 1, \ldots, k-1$, $i = 
0, 1, 2, \ldots$, 

(3) for each $t = 1, 2, \ldots, k$ and each $r = 0, 1, 2, \ldots$ the sequence 

$$x_r \bmod 2^t,\ x_{r+m} \bmod 2^t,\ x_{r+2m} \bmod 2^t,\ \ldots$$

is purely periodic, its shortest period is of length $2^t$, each element of 
$\mathbb{Z}/2^t\mathbb{Z}$ occurs at the period exactly once. 

Note 7.5. Assuming $m = 1$ in 7.3 one obtains ergodicity criterion 6.21. 

Corollary 7.6 ([6]). Let a finite sequence of mappings $\{g_0, \ldots, g_{m-1}\}$ of $\mathbb{Z}_2$ 
into itself satisfy conditions of theorem 7.3, and let $\{F_0, \ldots, F_{m-1}\}$ be an 
arbitrary finite sequence of balanced (and not necessarily compatible) 
mappings of $\mathbb{Z}/2^n\mathbb{Z}$ ($n \ge 1$) onto $\mathbb{Z}/2^k\mathbb{Z}$, $1 \le k \le n$. Then the sequence 
$\mathcal{F} = \{F_{i \bmod m}(x_i) : i = 0, 1, 2, \ldots\}$, where $x_{i+1} = g_{i \bmod m}(x_i) \bmod 2^n$, is 
strictly uniformly distributed over $\mathbb{Z}/2^k\mathbb{Z}$: it is purely periodic with a 
period of length $2^n m$, and each element of $\mathbb{Z}/2^k\mathbb{Z}$ occurs at the period exactly 
$2^{n-k} m$ times. 

Theorem 7.3 and lemma 7.4 together with corollary 7.6 enable one to 
construct a counter-dependent generator out of the following components: 

• A sequence $c_0, \ldots, c_{m-1}$ of integers, which we call a control sequence. 

• A sequence $h_0, \ldots, h_{m-1}$ of compatible mappings, which is used to 
form a sequence of clock state update functions $g_i$. 

• A sequence $H_0, \ldots, H_{m-1}$ of compatible mappings to produce clock 
output functions $F_i$. 

Note that the ergodic functions that are needed could be produced out of 
compatible ones with the use of 6.18 or 6.23. A control sequence could be 
produced by an external generator (which in turn could be a generator of 
the kind considered in this course), or it could be just a queue in which the state 
update and output functions are called from a look-up table. The functions 
$h_i$ and/or $H_i$ could be either precomputed to arrange that look-up table, or 
they could be produced on-the-fly in a form that is determined by a control 
sequence. This form may also look 'crazy', e.g., 

$$h_i(x) = (\cdots((u_0(\delta_0(c_i)) \bigcirc_{\delta_1(c_i),\delta_2(c_i)} u_1(\delta_3(c_i))) \bigcirc_{\delta_4(c_i),\delta_5(c_i)} u_2(\delta_6(c_i))) \cdots) \qquad (7.6.1)$$

where $u_j(0) = x$, the variable, and $u_j(1)$ is a constant (which is determined 
by $c_i$, or is read from a precomputed look-up table, etc.), while (say) $\bigcirc_{0,0} = 
+$, an integer addition, $\bigcirc_{1,0} = \cdot$, an integer multiplication, $\bigcirc_{0,1} = $ XOR, 
$\bigcirc_{1,1} = $ AND. It does not matter at all what these $h_i$ and $H_i$ look like 
or how they are obtained: the above stated results give a general method 
to combine all the data together to produce a uniformly distributed output 
sequence of a maximum period length. 

Examples 7.7 ([ ]). A basic circuit illustrating these example wreath 
products is given at Figure 3. 

(1) Let $c_0, \ldots, c_{m-1}$ be an arbitrary sequence of length $m = 2^s$, and 
let $h_0, \ldots, h_{m-1}$ be arbitrary compatible mappings. For $0 \le j \le 
m-1$ put $\tilde h_j(x) = 1 + x + 4 \cdot h_j(x)$ and let $g_j(x) = c_j + \tilde h_j(x)$. 
These mappings $g_j$ satisfy conditions of theorem 7.3 if and only if 
$\sum_{j=0}^{m-1} c_j \equiv 1 \pmod 2$. 

(2) For $m > 1$ odd let $\{h_0, \ldots, h_{m-1}\}$ be a finite sequence of 
compatible and ergodic mappings; let $c_0, \ldots, c_{m-1}$ be a finite sequence of 
integers such that 

• $\sum_{j=0}^{m-1} c_j \equiv 0 \pmod 2$, and 

• the sequence $\{c_{i \bmod m} \bmod 2 : i = 0, 1, 2, \ldots\}$ is purely periodic 
with the shortest period of length $m$. 

Put $g_j(x) = c_j \oplus h_j(x)$ (respectively, $g_j(x) = c_j + h_j(x)$). Then the $g_j$ 
satisfy conditions of 7.3. 

(3) The conditions of (2) are satisfied in case $m = 2^s - 1$ and $c_0, \ldots, c_{m-1}$ 
is the output sequence of a maximum period linear feedback shift 
register over $\mathbb{Z}/2\mathbb{Z}$ with $s$ cells. 

Figure 3. Wreath product basic circuit for Examples 7.7. 
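
Example (1) can be tried out on short words. The sketch below is an illustration under assumptions, not the author's implementation: the word length, the control sequence `c`, the four T-functions in `hs` and all helper names are chosen here for the demonstration only.

```python
from collections import Counter

def make_update_functions(c, hs, n):
    """Build g_j(x) = c_j + 1 + x + 4*h_j(x) (mod 2^n), following Example 7.7(1)."""
    mask = (1 << n) - 1
    return [lambda x, cj=cj, h=h: (cj + 1 + x + 4 * h(x)) & mask
            for cj, h in zip(c, hs)]

def run(gs, x0, steps):
    """Return x_0, ..., x_{steps-1}, where x_{i+1} = g_{i mod m}(x_i)."""
    xs, x = [], x0
    for i in range(steps):
        xs.append(x)
        x = gs[i % len(gs)](x)
    return xs

def shortest_period(seq):
    """Shortest cyclic period of a list that holds one full period."""
    size = len(seq)
    return next(p for p in range(1, size + 1)
                if size % p == 0 and seq == seq[p:] + seq[:p])

n = 6
c = [1, 0, 0, 0]                      # m = 4 = 2^2, the sum of the c_j is odd
hs = [lambda x: x * x,                # arbitrary T-functions (illustrative choices)
      lambda x: x ^ 0b1011,
      lambda x: x & (x + 3),
      lambda x: x | 9]
gs = make_update_functions(c, hs, n)
m = len(gs)
states = run(gs, 0, (1 << n) * m)     # 2^n * m consecutive states
counts = Counter(states)
```

With these choices `shortest_period(states)` equals $2^n m = 256$ and every residue modulo $2^n$ is counted exactly $m = 4$ times, in agreement with Theorem 7.3.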



8. Properties of output sequences 

In this section we study the structure and statistical properties of output 
sequences of wreath products of automata, that is, sequences described by 
Theorem 7.3. Note that in view of 7.5, all the results of this section remain 
true for compatible ergodic mappings $T\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ (i.e., for T-functions) as well. 

8.1. Distribution of k-tuples. The output sequence $\mathcal{Z}$ of any wreath 
product of automata that satisfy 7.3 is strictly uniformly distributed as 
a sequence over $\mathbb{Z}/2^n\mathbb{Z}$ for all $n$. That is, each sequence $\mathcal{Z}_n$ of residues 
modulo $2^n$ of terms of the sequence $\mathcal{Z}$ is purely periodic, and each element 
of $\mathbb{Z}/2^n\mathbb{Z}$ occurs at the period the same number of times. However, when 
this sequence $\mathcal{Z}_n$ is used as a key-stream, that is, as a binary sequence $\mathcal{Z}'_n$ 
obtained by concatenation of successive $n$-bit words of $\mathcal{Z}$, it is important 
to know how $n$-tuples are distributed in this binary sequence. Yet strict 
uniform distribution of an arbitrary sequence $\mathcal{T}$ as a sequence over $\mathbb{Z}/2^n\mathbb{Z}$ 
does not necessarily imply uniform distribution of $n$-tuples if this sequence 
is considered as a binary sequence $\mathcal{T}'$. 

For instance, let $\mathcal{T} = 0132013201321\ldots$. This sequence is strictly 
uniformly distributed over $\mathbb{Z}/4\mathbb{Z}$; the length of its shortest period is 4. Its 
binary representation is $\mathcal{T}' = 000111100001111000011110\ldots$ Considering $\mathcal{T}$ 
as a sequence over $\mathbb{Z}/4\mathbb{Z}$, each number of $\{0, 1, 2, 3\}$ occurs in the sequence 
with the same frequency $\frac14$. Yet if we consider $\mathcal{T}$ in its binary form $\mathcal{T}'$, then 
00 (as well as 11) occurs in this sequence with frequency $\frac38$, whereas 01 (as 
well as 10) occurs with frequency $\frac18$. 

In this subsection we show that such an effect does not take place for 
output sequences of automata described in 7.3, 7.4, and 7.7: considering 
any of these sequences in a binary form, the distribution of $k$-tuples is uniform 
for all $k \le n$. Now we state this property formally. 

Consider a (binary) $n$-cycle $C = (\varepsilon_0 \varepsilon_1 \ldots \varepsilon_{n-1})$, i.e., an oriented graph on 
vertices $\{a_0, a_1, \ldots, a_{n-1}\}$ and edges 

$$\{(a_0,a_1), (a_1,a_2), \ldots, (a_{n-2},a_{n-1}), (a_{n-1},a_0)\},$$

where each vertex $a_j$ is labelled with $\varepsilon_j \in \{0, 1\}$, $j = 0, 1, \ldots, n-1$. (Note 
that then $(\varepsilon_0 \varepsilon_1 \ldots \varepsilon_{n-1}) = (\varepsilon_{n-1} \varepsilon_0 \ldots \varepsilon_{n-2}) = \ldots$, etc.) Clearly, each purely 
periodic sequence $\mathcal{S}$ over $\mathbb{Z}/2\mathbb{Z}$ with period $\alpha_0 \ldots \alpha_{n-1}$ of length $n$ could be 
related to a binary $n$-cycle $C(\mathcal{S}) = (\alpha_0 \ldots \alpha_{n-1})$. Conversely, to each binary 
$n$-cycle $(\alpha_0 \ldots \alpha_{n-1})$ we could relate $n$ purely periodic binary sequences with 
periods of length $n$: those are the $n$ shifted versions of the sequence 

$$\alpha_0 \ldots \alpha_{n-1} \alpha_0 \ldots \alpha_{n-1} \ldots$$

Further, a $k$-chain in a binary $n$-cycle $C$ is a binary string $\beta_0 \ldots \beta_{k-1}$, 
$k < n$, that satisfies the following condition: there exists $j \in \{0, 1, \ldots, n-1\}$ 
such that $\beta_i = \varepsilon_{(i+j) \bmod n}$ for $i = 0, 1, \ldots, k-1$. Thus, a $k$-chain is just a 
string of length $k$ of labels that corresponds to a chain of length $k$ in the graph 
$C$. We call a binary $n$-cycle $C$ $k$-full if each $k$-chain occurs in the graph $C$ 
the same number $r > 0$ of times. 

Clearly, if $C$ is $k$-full, then $n = 2^k r$. For instance, a well-known De Bruijn 
sequence is an $n$-full $2^n$-cycle. Clearly, a $k$-full $n$-cycle is $(k-1)$-full: 
each $(k-1)$-chain occurs in $C$ exactly $2r$ times, etc. Thus, if an $n$-cycle 
$C(\mathcal{S})$ is $k$-full, then each $m$-tuple (where $1 \le m \le k$) occurs in the sequence 
$\mathcal{S}$ with the same probability (limit frequency) $\frac{1}{2^m}$. That is, the sequence $\mathcal{S}$ 
is $k$-distributed, see [16, Section 3.5, Definition D]. 

Definition 8.1. A purely periodic binary sequence $\mathcal{S}$ with the shortest 
period of length $N$ is said to be strictly $k$-distributed iff the corresponding 
$N$-cycle $C(\mathcal{S})$ is $k$-full. 

Thus, if a sequence $\mathcal{S}$ is strictly $k$-distributed, then it is strictly $s$-distributed 
for all positive $s \le k$. 
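
The notions of a $k$-chain and a $k$-full cycle are easy to test by direct counting. The following sketch (helper names are illustrative, not from the notes) counts chains cyclically and applies the definition to one period of the binary form of 0, 1, 3, 2 and to a De Bruijn sequence of order 3.

```python
from collections import Counter
from itertools import product

def chain_counts(cycle, k):
    """Count every k-chain of a binary cycle given as a string; indices wrap around."""
    n = len(cycle)
    return Counter("".join(cycle[(j + i) % n] for i in range(k)) for j in range(n))

def is_k_full(cycle, k):
    """True iff every binary string of length k occurs the same number r > 0 of times."""
    counts = chain_counts(cycle, k)
    words = ["".join(w) for w in product("01", repeat=k)]
    return len({counts[w] for w in words}) == 1 and counts[words[0]] > 0

period_of_T = "00011110"   # one period of the binary form of 0, 1, 3, 2
de_bruijn = "00010111"     # a De Bruijn sequence of order 3
pair_counts = chain_counts(period_of_T, 2)
```

For `period_of_T` the pairs 00 and 11 are counted three times each and the pairs 01 and 10 once each, so this cycle is 1-full but not 2-full, whereas `de_bruijn` is $k$-full for $k = 1, 2, 3$.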

Theorem 8.2 ([ ]). For the sequence $\mathcal{Z}$ of theorem 7.3 each binary sequence 
$\mathcal{Z}'_n$ is strictly $k$-distributed for all $k = 1, 2, \ldots, n$. 

Note 8.3. Theorem 8.2 remains true for the sequence $\mathcal{F}$ of corollary 7.6, 
where $F_j(x) = \lfloor x / 2^{n-k} \rfloor \bmod 2^k$, $j = 0, 1, \ldots, m-1$, is a truncation of the $(n-k)$ 
less significant bits. Namely, the binary representation $\mathcal{F}'_n$ of the sequence $\mathcal{F}$ 
is a purely periodic strictly $k$-distributed binary sequence with a period of 
length $2^n m k$. 
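
For $m = 1$ the statement of Theorem 8.2 can be observed on a small single-cycle T-function. The sketch below is an illustration under assumptions: it uses the Klimov and Shamir map $x + (x^2 \vee 5)$ with a 5-bit word, and it assumes that words are concatenated most significant bit first (the convention suggested by the example 0132 above); the function names are hypothetical.

```python
from collections import Counter

def klimov_shamir(x, n, C=5):
    """x + (x^2 OR C) mod 2^n; the text states it has a single cycle for C = 5 (mod 8)."""
    return (x + ((x * x) | C)) & ((1 << n) - 1)

def binary_keystream(n):
    """Concatenate one period of n-bit states, most significant bit first (assumed order)."""
    bits, x = [], 0
    for _ in range(1 << n):
        bits.extend((x >> j) & 1 for j in reversed(range(n)))
        x = klimov_shamir(x, n)
    return bits

def chain_counts(bits, k):
    """Cyclic counts of all k-tuples in the bit list."""
    size = len(bits)
    return Counter(tuple(bits[(j + i) % size] for i in range(k)) for j in range(size))

n = 5
bits = binary_keystream(n)            # a binary cycle of length n * 2^n
```

For every $k \le n$ each of the $2^k$ binary $k$-tuples is counted $n \cdot 2^{n-k}$ times in this cycle, so the cycle is $k$-full.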

Theorem 8.2 treats an output sequence of a counter-dependent automaton 
as an infinite (though periodic) binary sequence. However, in 
cryptography only a part of a period is used during encryption. So it is natural 
to ask how 'random' a finite segment (namely, the period) of this 
infinite sequence is. According to [16, Section 3.5, Definition Q1] a finite binary 
sequence $\varepsilon_0 \varepsilon_1 \ldots \varepsilon_{N-1}$ of length $N$ is said to be random iff 

$$\left| \frac{\nu(\beta_0 \ldots \beta_{k-1})}{N} - \frac{1}{2^k} \right| \le \frac{1}{\sqrt{N}} \qquad (8.3.1)$$

for all $0 < k \le \log_2 N$, where $\nu(\beta_0 \ldots \beta_{k-1})$ is the number of occurrences of 
a binary word $\beta_0 \ldots \beta_{k-1}$ in the binary word $\varepsilon_0 \varepsilon_1 \ldots \varepsilon_{N-1}$. If a finite sequence 
is random in the sense of this Definition Q1 of [16], we shall say that this 
sequence satisfies Q1. We shall also say that an infinite periodic sequence 
satisfies Q1 iff its shortest period satisfies Q1. Note that, in contrast to 
the case of strict $k$-distribution, which implies strict $(k-1)$-distribution, it 
is not enough to demonstrate only that (8.3.1) holds for $k = \lfloor \log_2 N \rfloor$ to 
prove that a finite sequence of length $N$ satisfies Q1: for instance, the sequence 
1111111100000111 satisfies (8.3.1) for $k = \lfloor \log_2 N \rfloor = 4$ and does not satisfy 
(8.3.1) for $k = 3$. 
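
The closing example can be verified mechanically. The sketch below assumes that (8.3.1) reads $|\nu/N - 2^{-k}| \le 1/\sqrt{N}$ and that occurrences are counted inside the finite word without wrap-around; both are readings of the text rather than something it spells out, and the function names are illustrative.

```python
from itertools import product

def occurrences(word, bits):
    """Number of (possibly overlapping) occurrences of word in bits, no wrap-around."""
    k = len(word)
    return sum(bits[i:i + k] == word for i in range(len(bits) - k + 1))

def satisfies_831(bits, k):
    """Check |nu/N - 1/2^k| <= 1/sqrt(N) for every binary word of length k.

    The inequality is squared and cleared of denominators, so only integer
    arithmetic is used: (nu * 2^k - N)^2 <= N * 4^k.
    """
    size = len(bits)
    return all((occurrences("".join(w), bits) * 2**k - size) ** 2 <= size * 4**k
               for w in product("01", repeat=k))

sample = "1111111100000111"
```

Under these assumptions `satisfies_831(sample, 4)` is true (the word 1111 occurs five times, which meets the bound with equality), while `satisfies_831(sample, 3)` is false because 111 occurs seven times.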

Corollary 8.4 ([ ]). The sequence Z' n of theorem 8.2 satisfies Ql ifm < 
Moreover, in this case under the conditions of 8.3 the output binary sequence 
still satisfies Ql if one truncates < k < ^ — log 2 § lower order bits (that 
is, if one uses clock output functions Fj of 8.3). 

We note here that according to 8.4 a control sequence of a 
counter-dependent automaton (see 7.3, 7.4, 7.6, and the text and examples 
thereafter) may not satisfy Q1 at all, yet nevertheless the corresponding output 
sequence necessarily satisfies Q1. Thus, with the use of wreath product 
techniques one could stretch 'non-randomly looking' sequences to 'randomly 
looking' ones. 

8.2. Structure. A recurrence sequence could be 'very uniformly distributed', 
yet nevertheless could have some mathematical structure that might be used 
by an attacker to break the cipher. For instance, the clock sequence $x_i = i$ 
is uniformly distributed in $\mathbb{Z}_2$. We are going to study what structure 
the sequences output by our counter-dependent generators could have. 

Theorem 7.3 immediately implies that the $j$-th coordinate sequence $\delta_j(\mathcal{Z}) = 
\{\delta_j(x_i) : i = 0, 1, 2, \ldots\}$ ($j = 0, 1, 2, \ldots$) of the sequence $\mathcal{Z}$, i.e., the sequence 
formed by all $j$-th bits of terms of the sequence $\mathcal{Z}$, has a period not longer 
than $m \cdot 2^{j+1}$. Moreover, the following could be easily proved: 

Proposition 8.5 ([ ]). (1) The $j$-th coordinate sequence $\delta_j(\mathcal{Z})$ is a purely 
periodic binary sequence with a period of length $2^{j+1} m$; and (2) the second 
half of the period is a bitwise negation of the first half: $\delta_j(x_{i+2^j m}) \equiv \delta_j(x_i) + 1 
\pmod 2$, $i = 0, 1, 2, \ldots$ 

Note. The $j$-th coordinate sequence of a sequence generated by a single-cycle 
T-function is purely periodic, and $2^{j+1}$ is the length of the shortest period 
of this sequence. The second half of the period is a bitwise negation of the 
first half, i.e., $\zeta_{i+2^j} \equiv \zeta_i + 1 \pmod 2$ for each $i = 0, 1, 2, \ldots$ 
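
The negation property of the Note is simple to observe numerically. The sketch below uses the Klimov and Shamir map $x + (x^2 \vee 5)$ on 8-bit words as a sample single-cycle T-function; the word length and the helper names are assumptions made for the illustration.

```python
def klimov_shamir(x, n, C=5):
    """x + (x^2 OR C) mod 2^n, a single-cycle T-function for C = 5 (mod 8)."""
    return (x + ((x * x) | C)) & ((1 << n) - 1)

def coordinate_sequence(j, n, length, x0=0):
    """Bits delta_j(x_0), delta_j(x_1), ... along the orbit of x0."""
    out, x = [], x0
    for _ in range(length):
        out.append((x >> j) & 1)
        x = klimov_shamir(x, n)
    return out

def second_half_is_negation(bits, half):
    """Check bits[i + half] = bits[i] + 1 (mod 2) wherever both indices exist."""
    return all(bits[i + half] == 1 - bits[i] for i in range(len(bits) - half))
```

For every $j$ the check `second_half_is_negation(coordinate_sequence(j, 8, 512), 2**j)` succeeds; since the period is a power of 2 and $2^j$ is not a period, the shortest period of the $j$-th coordinate sequence is exactly $2^{j+1}$.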

Proposition 8.5 means that the $j$-th coordinate sequence of the sequence 
of states of a counter-dependent generator is completely determined by the 
first half of its period; so, intuitively, it is as 'complex' as the first half of its 
period. Thus we ought to understand what sequences of length $2^j m$ occur 
as the first half of the period of the $j$-th coordinate sequence. 

For $j = 0$ (and $m > 1$) the answer immediately follows from 7.3 and 7.4: 
any binary sequence $c_0, \ldots, c_{m-1}$ such that $\sum_{i=0}^{m-1} c_i \equiv 1 \pmod 2$ does. 
It turns out that for $j > 0$ any binary sequence could be produced as the 
first half of the period of the $j$-th coordinate sequence independently of other 
coordinate sequences. 

More formally, to each sequence $\mathcal{Z}$ described by theorem 7.3 we associate 
a sequence $\Gamma(\mathcal{Z}) = \{\gamma_1, \gamma_2, \ldots\}$ of non-negative rational integers $\gamma_j \in \mathbb{N}_0 = 
\{0, 1, 2, \ldots\}$ such that $0 \le \gamma_j \le 2^{2^j m} - 1$ and the base-2 expansion of $\gamma_j$ 
agrees with the first half of the period of the $j$-th coordinate sequence $\delta_j(\mathcal{Z})$ 
for all $j = 1, 2, \ldots$; that is 

$$\gamma_j = \delta_j(x_0) + 2 \cdot \delta_j(x_1) + 4 \cdot \delta_j(x_2) + \cdots + 2^{2^j m - 1} \cdot \delta_j(x_{2^j m - 1}),$$

where $x_0$ is an initial state; $x_{i+1} = g_{i \bmod m}(x_i)$, $i = 0, 1, 2, \ldots$. Now we take 
an arbitrary sequence $\Gamma = \{\gamma_1, \gamma_2, \ldots\}$ of non-negative rational integers 
$\gamma_j$ such that $0 \le \gamma_j \le 2^{2^j m} - 1$ and wonder whether this sequence could be 
so associated to some sequence $\mathcal{Z}$ described by theorem 7.3. 
The answer is yes. Namely, the following theorem holds. 

Theorem 8.6 ([ ]). Let $m > 1$ be a rational integer, and let $\Gamma = \{\gamma_1, \gamma_2, \ldots\}$ 
be an arbitrary sequence over $\mathbb{N}_0$ such that $\gamma_j \in \{0, 1, 2, \ldots, 2^{2^j m} - 1\}$ for 
all $j = 1, 2, \ldots$. Then there exist a finite sequence $\mathcal{G} = \{g_0, \ldots, g_{m-1}\}$ 
of compatible measure preserving mappings of $\mathbb{Z}_2$ onto itself and a 2-adic 
integer $x_0 = z \in \mathbb{Z}_2$ such that $\mathcal{G}$ satisfies conditions of theorem 7.3, and the 
base-2 expansion of $\gamma_j$ agrees with the first $2^j m$ terms of the sequence $\delta_j(\mathcal{Z})$ 
for all $j = 1, 2, \ldots$, where the recurrence sequence $\mathcal{Z} = \{x_0, x_1, \ldots \in \mathbb{Z}_2\}$ 
is defined by the recurrence relation $x_{i+1} = g_{i \bmod m}(x_i)$ ($i = 0, 1, 2, \ldots$). 
In case $m = 1$ the assertion holds for an arbitrary $\Gamma = \{\gamma_0, \gamma_1, \ldots\}$, where 
$\gamma_j \in \{0, 1, 2, \ldots, 2^{2^j} - 1\}$, $j = 0, 1, 2, \ldots$. 

Proof. We will prove the theorem only for $m = 1$ (i.e., for T-functions) for 
two reasons. First, in this case the use of methods of 2-adic analysis becomes 
more transparent, and second, the proof for $m > 1$ is much more technical 
and complicated (an interested reader is referred to [6]). 

Speaking informally, we fill a table with a countably infinite number of 
rows and columns in such a way that the first $2^j$ entries of the $j$-th column 
represent $\gamma_j$ in its base-2 expansion, and the other entries of this column 
are obtained from these by applying the recursive relation of Proposition 8.5; 
that is, the next $2^j$ entries are the bitwise negation of the first $2^j$ entries, the 
third $2^j$ entries are the bitwise negation of the second $2^j$ entries, etc. Then we 
read each $i$-th row of the table as the 2-adic canonical representation of a 2-adic 
integer, which we denote by $z_i$. Thus we define a set $\mathcal{Z} = \{z_0, z_1, \ldots\}$ of 
2-adic integers. 

We shall prove that $\mathcal{Z}$ is a dense subset in $\mathbb{Z}_2$, and then define $f$ on $\mathcal{Z}$ 
in such a way that $f$ is compatible and ergodic on $\mathcal{Z}$. This will imply the 
assertion of the theorem. 

Proceeding along this way we claim that $\mathcal{Z} \bmod 2^k = \mathbb{Z}/2^k\mathbb{Z}$ for all $k = 
1, 2, 3, \ldots$, i.e., the natural ring homomorphism $\bmod\ 2^k\colon z \mapsto z \bmod 2^k$ maps 
$\mathcal{Z}$ onto the residue ring $\mathbb{Z}/2^k\mathbb{Z}$. Indeed, this trivially holds for $k = 1$. 
Assuming our claim holds for $k < m$ we prove it for $k = m$ (in this induction $m$ 
denotes the induction index, not the number of mappings). Given arbitrary 
$t \in \{0, 1, \ldots, 2^m - 1\}$ there exists $z_i \in \mathcal{Z}$ such that $z_i \equiv t \pmod{2^{m-1}}$. If $z_i \not\equiv 
t \pmod{2^m}$ then $\delta_{m-1}(z_i) \equiv \delta_{m-1}(t) + 1 \pmod 2$ and thus $\delta_{m-1}(z_{i+2^{m-1}}) \equiv 
\delta_{m-1}(t) \pmod 2$. However, $z_{i+2^{m-1}} \equiv z_i \pmod{2^{m-1}}$. Hence $z_{i+2^{m-1}} \equiv t 
\pmod{2^m}$. 

A similar argument shows that for each $k \in \mathbb{N}$ the sequence $\{z_i \bmod 
2^k : i = 0, 1, 2, \ldots\}$ is purely periodic with period length $2^k$, and each $t \in 
\{0, 1, \ldots, 2^k - 1\}$ occurs at the period exactly once (in particular, all elements 
of $\mathcal{Z}$ are pairwise distinct 2-adic integers). Moreover, $i \equiv i' \pmod{2^k}$ iff 
$z_i \equiv z_{i'} \pmod{2^k}$. Consequently, $\mathcal{Z}$ is dense in $\mathbb{Z}_2$ since for each $t \in \mathbb{Z}_2$ 
and each $k \in \mathbb{N}$ there exists $z_i \in \mathcal{Z}$ such that $\|z_i - t\|_2 \le 2^{-k}$. Moreover, 
if we define $f(z_i) = z_{i+1}$ for all $i = 0, 1, 2, \ldots$ then $\|f(z_i) - f(z_{i'})\|_2 = 
\|z_{i+1} - z_{i'+1}\|_2 = \|(i+1) - (i'+1)\|_2 = \|i - i'\|_2 = \|z_i - z_{i'}\|_2$. Hence, $f$ 
is well defined and compatible on $\mathcal{Z}$; it follows that the continuation of $f$ 
to the whole space $\mathbb{Z}_2$ is compatible. Yet $f$ is transitive modulo $2^k$ for each 
$k \in \mathbb{N}$, so its continuation is ergodic. □ 
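
The table used in the proof can be built explicitly for a finite number of columns. The following sketch is illustrative only (the function names and the truncation to finitely many columns are assumptions): it fills the columns from given values $\gamma_0, \gamma_1, \ldots$, reads the rows as integers $z_i$, and checks the two properties the proof relies on.

```python
def build_rows(gammas, count):
    """Rows z_0, ..., z_{count-1} of the table, truncated to len(gammas) columns.

    Column j starts with the 2^j low-order bits of gammas[j]; every further
    block of 2^j entries is the bitwise negation of the preceding block.
    """
    rows = []
    for i in range(count):
        z = 0
        for j, gamma in enumerate(gammas):
            first_half_bit = (gamma >> (i % (1 << j))) & 1
            flips = (i >> j) & 1      # parity of the number of negations applied so far
            z |= (first_half_bit ^ flips) << j
        rows.append(z)
    return rows

def is_single_cycle_table(rows, K):
    """Check, for k <= K, that z_0..z_{2^k - 1} cover all residues mod 2^k
    and that z_i mod 2^k depends only on i mod 2^k."""
    for k in range(1, K + 1):
        mod = 1 << k
        if sorted(z % mod for z in rows[:mod]) != list(range(mod)):
            return False
        if any(rows[i] % mod != rows[i % mod] % mod for i in range(len(rows))):
            return False
    return True

rows = build_rows([1, 2], 4)          # gamma_0 = 1, gamma_1 = 2
```

For $\gamma_0 = 1$, $\gamma_1 = 2$ the rows are 1, 2, 3, 0, and the map $z_i \mapsto z_{i+1}$ is a single-cycle T-function modulo 4 whose first coordinate sequences start with the prescribed bits.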

Note 8.7 (Representation by T-functions). Suppose $m = 2^k$ under conditions 
of Theorem 8.6. Then, considering the sequence $\delta_j(\mathcal{Z})$, one deals with the 
$(j + m)$-th coordinate sequence of a single-cycle T-function. 

8.3. Linear complexity. Linear complexity is an important cryptographic 
measure of complexity of a binary sequence; being the number of cells of the 
shortest linear feedback shift register (LFSR) that outputs the given 
sequence (see footnote 11), it estimates the dimension of a linear system an attacker must solve 
to obtain the initial state. 

Theorem 8.8 ([6]). For $\mathcal{Z}$ and $m$ of theorem 7.3 let $\mathcal{Z}_j = \delta_j(\mathcal{Z})$, $j > 0$, 
be the $j$-th coordinate sequence. Represent $m = 2^k r$, where $r$ is odd. Then 
the length of the shortest period of $\mathcal{Z}_j$ is $2^{k+j+1} s$ for some $s \in \{1, 2, \ldots, r\}$, and 
both extreme cases $s = 1$ and $s = r$ occur: for every sequence $s_1, s_2, \ldots$ over 
the set $\{1, r\}$ there exists a sequence $\mathcal{Z}$ of theorem 7.3 such that the length of the 
shortest period of $\mathcal{Z}_j$ is $2^{k+j+1} s_j$ ($j = 1, 2, \ldots$). Moreover, the linear complexity 
$\lambda_2(\mathcal{Z}_j)$ of the sequence $\mathcal{Z}_j$ satisfies the following inequality: 

$$2^{k+j} + 1 \le \lambda_2(\mathcal{Z}_j) \le 2^{k+j} r + 1.$$

Both these bounds are sharp: for every sequence $t_1, t_2, \ldots$ over the set $\{1, r\}$ 
there exists a sequence $\mathcal{Z}$ of theorem 7.3 such that the linear complexity of $\mathcal{Z}_j$ 
is exactly $2^{k+j} t_j + 1$ ($j = 1, 2, \ldots$). 

Note. The linear complexity of the $j$-th coordinate sequence of a T-function 
is exactly $2^j + 1$, i.e., approximately half of the length of the period of the 
sequence. Note that the expectation of the linear complexity $\lambda_2(\mathcal{C})$ of a 
random sequence $\mathcal{C}$ of length $L$ is $\frac{L}{2}$. 
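
The value $2^j + 1$ from the Note can be confirmed with the Berlekamp and Massey algorithm. The sketch below is a textbook implementation over the field of two elements, applied to coordinate sequences of the Klimov and Shamir map $x + (x^2 \vee 5)$; the choice of map, word length and helper names are assumptions made for the illustration.

```python
def klimov_shamir(x, n, C=5):
    """x + (x^2 OR C) mod 2^n, a single-cycle T-function for C = 5 (mod 8)."""
    return (x + ((x * x) | C)) & ((1 << n) - 1)

def coordinate_sequence(j, n, length, x0=0):
    """Bits delta_j(x_0), delta_j(x_1), ... along the orbit of x0."""
    out, x = [], x0
    for _ in range(length):
        out.append((x >> j) & 1)
        x = klimov_shamir(x, n)
    return out

def linear_complexity(bits):
    """Berlekamp-Massey over GF(2): length of the shortest LFSR generating bits."""
    size = len(bits)
    c, b = [0] * (size + 1), [0] * (size + 1)   # current and previous connection polynomials
    c[0] = b[0] = 1
    L, m = 0, -1
    for i in range(size):
        d = bits[i]                              # discrepancy of the current LFSR at step i
        for k in range(1, L + 1):
            d ^= c[k] & bits[i - k]
        if d:
            t = c[:]
            shift = i - m
            for k in range(size + 1 - shift):
                c[k + shift] ^= b[k]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L
```

Feeding four periods of the $j$-th coordinate sequence (more than twice the expected complexity) gives `linear_complexity(...) == 2**j + 1` for the sampled values of $j$.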

Whereas the linear complexity of a binary sequence $\mathcal{X}$ is the length of the 
shortest LFSR that produces $\mathcal{X}$, the $\ell$-error linear complexity is the length 
of the shortest LFSR that produces a sequence with almost the same (with 
the exception of not more than $\ell$ terms) period as that of $\mathcal{X}$; that is, the 
two periods coincide everywhere but at $t \le \ell$ places. Obviously, a random 
sequence of length $L$ coincides with a sequence that has a period of length 
$L$ approximately at $\frac{L}{2}$ places. That is, the $\ell$-error linear complexity makes 
sense only for $\ell < \frac{L}{2}$. The following proposition holds. 



Footnote 11: i.e., the degree of the minimal polynomial over $\mathbb{Z}/2\mathbb{Z}$ of the given sequence. 

Proposition 8.9. Let $\mathcal{Z}$ be a sequence of Theorem 7.3, and let $m = 2^s > 1$. 
Then for $\ell$ less than the half of the length of the shortest period of the $j$-th 
coordinate sequence $\delta_j(\mathcal{Z})$, the $\ell$-error linear complexity of $\delta_j(\mathcal{Z})$ exceeds 
$2^{j+m-1}$, the half of the length of its shortest period. 

Proof. In view of Note 8.7 it suffices to prove the statement for the 
coordinate sequences of a T-function only. According to Proposition 8.5, the $j$-th 
coordinate sequence $\mathcal{Y} = \{\delta_j(x_i) : i = 0, 1, 2, \ldots\} = \delta_j(\mathcal{Z})$ of a T-function is a 
periodic sequence with the length of the shortest period $2^{j+1}$, which satisfies 
the relation 

$$\delta_j(x_{i+2^j}) \equiv \delta_j(x_i) + 1 \pmod 2 \qquad (8.9.1)$$

for all $i = 0, 1, 2, \ldots$ Since $2^{j+1}$ is the length of a period of the (binary) sequence 
$\mathcal{Y}$, 

$$w(X) = X^{2^{j+1}} + 1 = (X + 1)^{2^{j+1}}$$

is a characteristic polynomial (over the field $\mathbb{Z}/2\mathbb{Z}$ of two elements) of the 
sequence $\mathcal{Y}$. 

Let $\mathcal{Q} = \{q_i : i = 0, 1, 2, \ldots\}$ be a binary sequence produced by an LFSR 
with $d$ cells such that $\mathcal{Q}$ has a period of length $2^{j+1}$, and $\delta_j(x_i) = q_i$ for all 
$i \in \{0, 1, 2, \ldots, 2^{j+1} - 1\}$ with the exception of $\ell$ indexes $i = i_1, \ldots, i_\ell \in 
\{0, 1, \ldots, 2^{j+1} - 1\}$. Since $2^{j+1}$ is the length of a period of $\mathcal{Q}$, the minimal 
polynomial $\mu(X)$ of the sequence $\mathcal{Q}$ (which is of degree $d$ then) must 
divide the polynomial $X^{2^{j+1}} + 1 = (X + 1)^{2^{j+1}}$ over the field $\mathbb{Z}/2\mathbb{Z}$. 
Hence, $\mu(X) = (X + 1)^d$, and $d \le 2^{j+1}$. 

On the other hand, if $\ell < 2^j$, then in view of (8.9.1) the length of the 
shortest period of the sequence $\mathcal{Q}$ cannot be less than $2^{j+1}$. Hence, $d \ge 2^j + 1$, 
since otherwise $\mu(X)$ divides $(X + 1)^{2^j} = X^{2^j} + 1$; yet the latter would 
imply that $\mathcal{Q}$ has a period of length $2^j$. 

□ 

We can consider the linear complexity of a sequence with terms from an 
arbitrary commutative ring, not necessarily from the field of two elements. 

Definition 8.10. Let $\mathcal{Z} = \{z_i\}$ be a sequence over a commutative ring 
$R$. The linear complexity $\lambda_R(\mathcal{Z})$ of $\mathcal{Z}$ over $R$ is the smallest $r \in \mathbb{N}_0$ such 
that there exist $c, c_0, c_1, \ldots, c_{r-1} \in R$ (not all equal to 0) such that for all 
$i = 0, 1, 2, \ldots$ the following holds: 

$$c + \sum_{j=0}^{r-1} c_j \cdot z_{i+j} = 0. \qquad (8.10.1)$$

For instance, if $R = \mathbb{Z}/p^n\mathbb{Z}$, then geometrically equation (8.10.1) means 
that all the points $\left(\frac{z_i}{p^n}, \ldots, \frac{z_{i+r-1}}{p^n}\right)$, $i = 0, 1, 2, \ldots$, of a unit $r$-dimensional 
Euclidean hypercube fall into parallel hyperplanes. For instance, with the 
use of linear complexity over the residue ring $\mathbb{Z}/2^k\mathbb{Z}$ we can study the 
distribution of $r$-tuples of the sequence produced by an ergodic T-function modulo 
$2^k$. We already know that this sequence, being considered as a sequence of 
elements over $\mathbb{Z}/2^k\mathbb{Z}$, is strictly uniformly distributed: every element from 
$\mathbb{Z}/2^k\mathbb{Z}$ occurs at the period exactly once. But what about the distribution of 
consecutive pairs of elements? Triples? etc. It varies... 

For example, although every transitive linear congruential generator $x_{i+1} = 
a + b \cdot x_i \pmod{2^k}$ produces a strictly uniformly distributed sequence over 
$\mathbb{Z}/2^k\mathbb{Z}$, the linear complexity over $\mathbb{Z}/2^k\mathbb{Z}$ of this generator is only 2; hence, 
the distribution of pairs in produced sequences is rather poor: all the points 
that correspond to pairs of consecutive numbers fall into a small number of 
parallel straight lines in a unit square, and this picture does not depend on 
$k$, see Figure 6. 
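
The remark about parallel lines can be made concrete. In the sketch below (parameters $a = 1$, $b = 5$ and the helper names are illustrative assumptions) every pair $(x_i, x_{i+1})$ of a transitive linear congruential generator satisfies $x_{i+1} = a + b x_i - t \cdot 2^k$ for an integer $t$, so the pairs lie on the lines indexed by the values of $t$ that actually occur.

```python
def lcg_orbit(a, b, k, x0=0):
    """One pass of 2^k steps of x -> a + b*x (mod 2^k), starting from x0."""
    mod = 1 << k
    xs, x = [], x0
    for _ in range(mod):
        xs.append(x)
        x = (a + b * x) % mod
    return xs

def lines_hit(a, b, k):
    """Set of t such that some pair (x_i, x_{i+1}) lies on y = a + b*x - t*2^k."""
    mod = 1 << k
    xs = lcg_orbit(a, b, k)
    successors = xs[1:] + xs[:1]      # valid because the orbit is a single cycle
    return {(a + b * x - y) // mod for x, y in zip(xs, successors)}
```

With $a = 1$, $b = 5$ the generator is transitive and `lines_hit(1, 5, k)` is the same five-element set for $k = 8, 12, 16$: the number of lines is governed by $b$ and does not grow with $k$.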

Another example: the already mentioned T-function $x + (x^2 \vee C)$ of Klimov 
and Shamir has the single cycle property whenever $C \equiv 5 \pmod 8$ or $C \equiv 7 
\pmod 8$, see 6.22. However, the distribution of pairs of the sequence produced 
by this T-function varies from satisfactory (when there are few 1's in more 
significant bit positions, see Figure 4) to poor (when there are more 1's in 
these positions, see Figure 5). 

It is not easy to find a T-function that guarantees a good distribution of 
pairs. For instance, this problem is not completely solved even for quadratic 
generators with the single cycle property, despite a number of works in the 
area (see e.g. [11, 9] and a survey [10]). 

However, we can prove that with respect to the linear complexity over the 
residue ring the sequence $\mathcal{X}_n = \{f^i(x_0) \bmod p^n\}$ over $\mathbb{Z}/p^n\mathbb{Z}$, generated by a 
compatible ergodic polynomial $f(x) \in \mathbb{Q}[x]$ of degree $\ge 2$, is 'asymptotically 
good' (cf. Figure 7 for the distribution of pairs for a polynomial generator of 
degree 8). Namely, the following theorem holds: 

Theorem 8.11 ([ ]). $\lim_{n \to \infty} \lambda_{\mathbb{Z}/p^n\mathbb{Z}}(\mathcal{X}_n) = \infty$. Moreover, $\lambda_{\mathbb{Z}/p^n\mathbb{Z}}(\mathcal{X}_n)$ 
tends to $\infty$ not slower than $\log n$. 

We note, however, that in most real life ciphers the use of polynomials 
of higher degrees (say, of degrees higher than 2) is too time-costly; so the 
search for good functions continues! 

8.4. The 2-adic span. There are two other measures of complexity of a 
binary sequence, which were introduced in [14]: namely, 2-adic complexity 
and 2-adic span. Whereas linear complexity (which is also known as linear 
span) is the number of cells in a linear feedback shift register outputting a 
sequence $\mathcal{S}$ over $\mathbb{Z}/2$, the 2-adic span is the number of cells in both memory 
and register of a feedback with carry shift register (FCSR) that outputs $\mathcal{S}$, 
and the 2-adic complexity estimates the number of cells in the register of 
this FCSR. To be more exact, the 2-adic complexity $\Phi_2(\mathcal{S})$ of the (eventually) 
periodic sequence $\mathcal{S} = \{s_0, s_1, s_2, \ldots\}$ over $\mathbb{Z}/2$ is $\log_2(\Phi(u, v))$, where 
$\Phi(u, v) = \max\{|u|, |v|\}$ and $\frac{u}{v} \in \mathbb{Q}$ is the irreducible fraction such that its 
2-adic expansion agrees with $\mathcal{S}$, that is, $\frac{u}{v} = s_0 + s_1 2 + s_2 2^2 + \cdots \in \mathbb{Z}_2$. The 
number of cells in the register of the FCSR producing $\mathcal{S}$ is then $\lceil \log_2(\Phi(u, v)) \rceil$, 
the least rational integer not smaller than $\log_2(\Phi(u, v))$. Thus, we only need 
to estimate $\Phi_2(\mathcal{S})$. 

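Theorem 8.12 ([6]). Let $\mathcal{S}_j = \{s_0, s_1, s_2, \ldots\}$ be the $j$-th coordinate sequence 
of an ergodic T-function. Then the 2-adic complexity $\Phi_2(\mathcal{S}_j)$ of $\mathcal{S}_j$ is 

$$\log_2\left(\frac{2^{2^j} + 1}{\gcd(2^{2^j} + 1,\ \gamma + 1)}\right),$$

where $\gamma = s_0 + s_1 2 + s_2 2^2 + \cdots + s_{2^j - 1} 2^{2^j - 1}$. 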

Note. We note that $\gamma$ is a non-negative rational integer, $0 \le \gamma \le 2^{2^j} - 1$; 
also we note that for each $\gamma$ of this range there exists an ergodic mapping 
such that the first half of the period of the $j$-th coordinate sequence of the 
corresponding output is a base-2 expansion of $\gamma$ (see Theorem 8.6). Thus, to 
find all possible values of 2-adic complexity of the $j$-th coordinate sequence 
one has to decompose the $j$-th Fermat number $2^{2^j} + 1$. It is known that 
the $j$-th Fermat number is prime for $0 \le j \le 4$ and that it is composite for 
$5 \le j \le 23$. For each Fermat number outside this range it is not known 
whether it is prime or composite. The complete decomposition of the $j$-th Fermat 
number is not known for $j > 11$. Assuming for some $j \ge 2$ the $j$-th Fermat 
number is composite, all its factors are of the form $t 2^{j+2} + 1$, see e.g. [8] for 
further references. So, the following bounds for the 2-adic complexity $\Phi_2(\mathcal{S}_j)$ of 
the $j$-th coordinate sequence $\mathcal{S}_j$ hold: 

$$j + 3 \le \lceil \Phi_2(\mathcal{S}_j) \rceil \le 2^j + 1,$$

yet to prove whether the lower bound is sharp for a certain $j > 11$, or whether 
$\lceil \Phi_2(\mathcal{S}_j) \rceil$ could be actually less than $2^j + 1$ for $j > 23$ is as difficult as to 
decompose the $j$-th Fermat number or, respectively, to determine whether the 
$j$-th Fermat number is prime or composite. 

Proof of theorem 8.12. We only have to express $s_0 + s_1 2 + s_2 2^2 + \ldots$ as 
an irreducible fraction. Denote $\gamma = s_0 + s_1 2 + s_2 2^2 + \cdots + s_{2^j - 1} 2^{2^j - 1}$. 
Then using the second identity of (4.0.2) we in view of 8.5 obtain that 
$s_0 + s_1 2 + s_2 2^2 + \cdots + s_{2^{j+1} - 1} 2^{2^{j+1} - 1} = \gamma + 2^{2^j}(2^{2^j} - \gamma - 1) = \gamma'$ and hence 

$$s_0 + s_1 2 + s_2 2^2 + \cdots = \gamma' + \gamma' 2^{2^{j+1}} + \gamma' 2^{2 \cdot 2^{j+1}} + \cdots = \frac{\gamma'}{1 - 2^{2^{j+1}}} = \frac{\gamma - 2^{2^j}}{2^{2^j} + 1}.$$

This completes the proof in view of the definition of 2-adic complexity of a 
sequence. □ 
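
The fraction obtained in the proof can be checked by expanding it 2-adically and comparing with an actual coordinate sequence. The sketch below is an illustration under assumptions: it uses the Klimov and Shamir map $x + (x^2 \vee 5)$ as the ergodic T-function, and all function names are hypothetical.

```python
from math import gcd, log2

def klimov_shamir(x, n, C=5):
    """x + (x^2 OR C) mod 2^n, a single-cycle T-function for C = 5 (mod 8)."""
    return (x + ((x * x) | C)) & ((1 << n) - 1)

def coordinate_sequence(j, n, length, x0=0):
    """Bits delta_j(x_0), delta_j(x_1), ... along the orbit of x0."""
    out, x = [], x0
    for _ in range(length):
        out.append((x >> j) & 1)
        x = klimov_shamir(x, n)
    return out

def two_adic_digits(u, v, count):
    """First `count` digits of the 2-adic expansion of u/v, where v is odd."""
    digits = []
    for _ in range(count):
        d = u & 1                     # u/v and u have the same parity since v is odd
        digits.append(d)
        u = (u - d * v) // 2          # exact division: u - d*v is even
    return digits

def reduced_fraction(first_half):
    """Reduced form of (gamma - 2^h)/(2^h + 1), h = len(first_half), bits given low-order first."""
    h = len(first_half)
    gamma = sum(bit << i for i, bit in enumerate(first_half))
    u, v = gamma - (1 << h), (1 << h) + 1
    g = gcd(u, v)
    return u // g, v // g

def two_adic_complexity(u, v):
    """log2(max(|u|, |v|)) for an irreducible fraction u/v."""
    return log2(max(abs(u), abs(v)))
```

For the sampled coordinate sequences the 2-adic digits of the reduced fraction reproduce the sequence over several periods, and the denominator equals $(2^{2^j}+1)/\gcd(2^{2^j}+1, \gamma+1)$, which dominates the numerator in absolute value.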

Note. Similar estimates of $\Phi_2(\delta_{n-1}(\mathcal{S}))$ could be obtained for coordinate 
sequences of wreath products. In view of 8.5 the argument of the proof of 
8.12 gives that the representation of the binary sequence $\delta_{n-1}(\mathcal{S})$ as a 2-adic 
integer is $\frac{\gamma + 1}{2^{2^{n-1} m} + 1} - 1$, so we have only to study the fraction $\frac{\gamma + 1}{2^{2^{n-1} m} + 1}$, where 
$\gamma = s_0 + s_1 2 + s_2 2^2 + \cdots + s_{2^{n-1} m - 1} 2^{2^{n-1} m - 1}$, and $m$ is of statements of 
7.4, and of 7.3. Representing $m = 2^k m_1$ with $m_1 > 1$ odd, we can factorize 

$$2^{2^{n-1} m} + 1 = (2^{2^{n-1+k}} + 1)(2^{2^{n-1+k}(m_1 - 1)} - 2^{2^{n-1+k}(m_1 - 2)} + \cdots - 2^{2^{n-1+k}} + 1),$$

but the problem does not become much easier because of the first multiplier. 
We omit further details. 

9. Schemes 

In this section we are going to give some ideas how stream ciphers could 
be designed on the basis of the theory discussed above. We must now com- 
bine state update and output functions into an automaton that produces a 
sequence that might be cryptographically secure. 

9.1. Improving lower order bits. The drawback of the sequence 
produced by a T-function $f\colon \mathbb{Z}/2^n\mathbb{Z} \to \mathbb{Z}/2^n\mathbb{Z}$ with the single cycle property is 
that the less significant the bit is, the shorter is the period of the sequence 
it outputs (see 8.5); that is, although the length of the period of the sequence 

$$\mathcal{S} = \{u_0 = u,\ u_1 = f(u_0),\ u_2 = f(u_1),\ \ldots\}$$

of $n$-bit words is $2^n$, the length of the period of the $j$-th bit sequence (i.e., 
the $j$-th coordinate sequence) 

$$\mathcal{S}_j = \{\delta_j(u_0), \delta_j(u_1), \delta_j(u_2), \ldots, \delta_j(u_i), \ldots\}$$

is only $2^{j+1}$ ($j = 0, 1, \ldots, n-1$). 

From 8.8 it follows also that the smaller $j$ is, the smaller is the linear complexity 
of the coordinate sequence. Obviously, in applications we must get rid of 
this effect. 

Thus, designing a PRNG (see Fig. 1) we must understand what output 
function $F$ one should use: $F$ must add security, $F$ must be balanced (so as 
not to spoil the uniform distribution), and $F$ must cure the very unpleasant 
low order bits effect of T-functions. 

One way (that of Corollary 8.4) is to truncate low order bits. But this 
obviously will reduce the performance of the generator... Are there other 
ways? Since the low order bits effect is an inherent property of T-functions, 
one should include in the output function some basic chip operations other than 
T-functions. Thus, the output function will not be a T-function any more. 
Could one construct the output function this way, yet not 'spoil' the good 
properties of the sequence of states? 

Figure 8. PRNG with a bit order reverse permutation. ($\pi$ permutes bits so that 
$\delta_0(\pi(x_i)) = \delta_{n-1}(x_i)$; i.e., $\pi$ sends the most significant bit of $x_i$ 
to the least significant bit position.) 

A solution is given in Figure 8: we include in the composition only one mapping $\pi$ which permutes the bit order of the state (which is an $n$-bit word), sending the most significant bit (that is, the $(n-1)$-th bit) to the least significant bit position. An important example of such a permutation $\pi$ is a word rotation, $\chi_{n-1}\chi_{n-2}\cdots\chi_1\chi_0 \mapsto \chi_{n-2}\chi_{n-3}\cdots\chi_1\chi_0\chi_{n-1}$, which is also a standard instruction in most processors.
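As an illustration, here is a minimal sketch of two bit permutations with the required property $\delta_0(\pi(x)) = \delta_{n-1}(x)$: the one-position left rotation described above and the full bit order reversal. The function names and the word length $n = 8$ are assumptions made for the example.

```python
def bit(x, j):
    """delta_j(x): the j-th bit of x (bit 0 is the least significant)."""
    return (x >> j) & 1


def rotl1(x, n):
    """Rotate the n-bit word x left by one position."""
    mask = (1 << n) - 1
    return ((x << 1) & mask) | (x >> (n - 1))


def bit_reverse(x, n):
    """Reverse the bit order of the n-bit word x."""
    result = 0
    for j in range(n):
        result |= bit(x, n - 1 - j) << j
    return result


n = 8
# Both permutations send the most significant bit to the least significant position.
rotation_ok = all(bit(rotl1(x, n), 0) == bit(x, n - 1) for x in range(1 << n))
reversal_ok = all(bit(bit_reverse(x, n), 0) == bit(x, n - 1) for x in range(1 << n))
print(rotation_ok, reversal_ok)
```

The rotation is a single machine instruction on most processors, whereas the bit reversal written this way needs a loop over bits.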

The following could be proved regarding the output sequence of the so constructed counter-dependent generator:

Proposition 9.1 ([ ]). Let $H_i\colon \mathbb{Z}_2 \to \mathbb{Z}_2$ ($i = 0, 1, 2, \ldots, m-1$) be compatible and ergodic mappings. For $x \in \{0, 1, \ldots, 2^n - 1\}$ let

$$F_i(x) = (H_i(\pi(x))) \bmod 2^n,$$

where $\pi$ is a permutation of bits of $x \in \mathbb{Z}/2^n\mathbb{Z}$ such that $\delta_0(\pi(x)) = \delta_{n-1}(x)$. Consider a sequence $\mathcal{F}$ of 7.6. Then the shortest period of the $j$-th coordinate sequence $\mathcal{F}_j = \delta_j(\mathcal{F})$ ($j = 0, 1, 2, \ldots, n-1$) is of length $2^n k_j$ for a suitable $1 \le k_j \le m$. Moreover, the linear complexity of the sequence $\mathcal{F}_j$ exceeds $2^{n-1}$.
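The effect of the permutation $\pi$ can be checked numerically in the simplest setting. The sketch below assumes $m = 1$ (one state update function and one output function), takes $n = 8$, uses the rotation by one position as $\pi$, and uses the same assumed example map $x \mapsto x + (x^2 \vee 5)$ both as the state update and as $H$; none of these choices comes from the text.

```python
N_BITS = 8
MASK = (1 << N_BITS) - 1


def ergodic_map(x):
    """x + (x^2 OR 5) mod 2^n: an assumed example of a compatible ergodic map."""
    return (x + ((x * x) | 5)) & MASK


def rotl1(x):
    """Bit permutation pi: rotate left by one, so bit n-1 lands in bit 0."""
    return ((x << 1) & MASK) | (x >> (N_BITS - 1))


def shortest_period(bits):
    """Shortest period of a cyclic 0/1 list whose length is a power of two."""
    length = len(bits)
    p = 1
    while p < length:
        if all(bits[i] == bits[(i + p) % length] for i in range(length)):
            return p
        p *= 2
    return length


def output_bit_periods(pi):
    """Periods of the coordinate sequences of y_i = H(pi(x_i)) mod 2^n."""
    outputs, x = [], 0
    for _ in range(1 << N_BITS):
        outputs.append(ergodic_map(pi(x)))  # output function F = H composed with pi
        x = ergodic_map(x)                  # state update
    return [shortest_period([(y >> j) & 1 for y in outputs]) for j in range(N_BITS)]


without_pi = output_bit_periods(lambda x: x)  # plain T-function output
with_pi = output_bit_periods(rotl1)           # output through the permutation
print(without_pi)
print(with_pi)
```

Without $\pi$ the coordinate periods are $2, 4, \ldots, 2^n$; with $\pi$ every coordinate sequence has period $2^n$, which is the case $k_j = 1$ of the proposition.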

9.2. The ABC stream cipher. With the use of the above considerations a fast software-oriented stream cipher ABC is being developed now, see [2]. In this subsection we outline the underlying ideas of the design to demonstrate their relations with the theory developed above. To make these ideas more transparent, we consider the ABC 'template' (see Figure 9) rather than the actual design; the latter has some differences from the template due to the necessity to withstand certain attacks. However, we do not discuss these differences here since our aim is to illustrate the 2-adic techniques in stream cipher design rather than to give a comprehensive cryptographic analysis of a particular algorithm.

[Figure 9. The ABC stream cipher template. Labels in the diagram: $c_{i+1} = L(c_i)$; $h(x) = ((((x + a_0) \oplus b_0) + a_1) \oplus b_1) + a_2$; $S(x) = d + \sum_{j=0}^{n-1} d_j \cdot \delta_{n-j-1}(x)$; plain text stream; encrypted text stream. Here $L$ is a linear transformation, $\boxplus$ and $+$ stand for integer addition, and $\oplus$ stands for XOR.]

The main goal of the design was to achieve high performance and to prove 
some important properties of the key stream, e.g. long period and uniform 
distribution. 

The high performance is achieved by a very restricted set of instructions: only the fastest instructions, such as +, XOR and shifts, are allowed. That is why the clock state update function $f_i$ (cf. Figure 8) is of the form $h_i(x) = c_{i,r} + ((((x + a_0) \oplus b_0) + a_1) \oplus b_1) + a_2$.
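A state update of this shape costs only a handful of word instructions. The following minimal sketch assumes 32-bit words and arbitrary illustrative constants (the values of $a_0, a_1, a_2, b_0, b_1$ below are hypothetical, not those of ABC). It also checks the T-function property: the low $k$ bits of the result depend only on the low $k$ bits of the arguments.

```python
import random

MASK32 = 0xFFFFFFFF

# Hypothetical constants, chosen only for illustration (not the ABC key material).
A0, A1, A2 = 0x9E3779B9, 0x7F4A7C15, 0x85EBCA6B
B0, B1 = 0xC2B2AE35, 0x27D4EB2F


def h(x, c_r=0):
    """c_r + ((((x + a0) XOR b0) + a1) XOR b1) + a2, reduced mod 2^32."""
    y = (x + A0) & MASK32
    y ^= B0
    y = (y + A1) & MASK32
    y ^= B1
    y = (y + A2) & MASK32
    return (c_r + y) & MASK32


def agree_on_low_bits(u, v, k):
    """True if u and v have the same k least significant bits."""
    return ((u ^ v) & ((1 << k) - 1)) == 0


rng = random.Random(0)
compatible = True
for _ in range(1000):
    x = rng.getrandbits(32)
    k = rng.randrange(1, 32)
    y = x ^ (rng.getrandbits(32) << k) & MASK32  # change only bits k and above
    compatible &= agree_on_low_bits(h(x), h(y), k)
print(compatible)
```

Compatibility holds for any choice of constants, since additions and XORs are T-functions; whether the map is also ergodic depends on the constants and is not tested here.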

Now recall Example 6.24 and Example 3 of 7.7. Note that $L$ is a linear transformation that is produced by a linear feedback shift register (LFSR) of maximum period length; $c_{i,r}$ is the right-hand part of the output word, so the sequence $\{c_{i,r} : i = 0, 1, 2, \ldots\}$ is an LFSR sequence of maximum period length. Thus, the state sequence $\{x_i\}$ has the maximum period length, and is strictly uniformly distributed.
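To recall what a maximum period LFSR looks like, here is a toy sketch with a 4-bit register and the recurrence $s_{t+4} = s_t \oplus s_{t+1}$ (characteristic polynomial $x^4 + x + 1$). The register size, the taps and the function names are illustrative assumptions and have nothing to do with the actual register used in ABC.

```python
def lfsr_step(state, taps, width):
    """One step of a Fibonacci LFSR: XOR the tapped bits, shift right, feed back on top."""
    feedback = 0
    for t in taps:
        feedback ^= (state >> t) & 1
    return (state >> 1) | (feedback << (width - 1))


def lfsr_period(seed, taps, width):
    """Number of steps until the register returns to its (nonzero) seed."""
    state, steps = lfsr_step(seed, taps, width), 1
    while state != seed:
        state = lfsr_step(state, taps, width)
        steps += 1
    return steps


period = lfsr_period(0b0001, taps=(0, 1), width=4)
print(period)  # 15 = 2^4 - 1, the maximum for a 4-bit register
```

Every nonzero state lies on the same cycle of length $2^4 - 1 = 15$, which is what "maximum period length" means for a linear register.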

After producing a uniformly distributed sequence of states, we need to improve the period lengths of the output sequence. In ABC we do it with the use of Proposition 9.1, that is, by a circuit described by Figure 8.

Actually, in ABC we take $\pi$ to be a bit order reverse permutation,

$$\delta_j(\pi(x)) = \delta_{n-j-1}(x)$$

for all $x \in \mathbb{Z}/2^n\mathbb{Z}$. However, this permutation is rather slow in software since one has to work with bits rather than with words. Yet we use a trick to avoid this undesirable reduction in performance. The trick is based on the use of a special output function $S(x) = d + \sum_{j=0}^{n-1} d_j \cdot \delta_{n-j-1}(x)$, which is a composition of two functions: the permutation $\pi$ and the function $F(x) = d + d_0 \cdot \delta_0(x) + d_1 \cdot \delta_1(x) + \cdots$. Thus, to apply Proposition 9.1, we must know when $F$ is ergodic.
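The identity $S = F \circ \pi$ means that the bit reversal never has to be executed: it is absorbed into the indexing of the coefficients. A minimal sketch for $n = 8$, with hypothetical coefficients $d, d_0, d_1, \ldots$ chosen only for illustration:

```python
N_BITS = 8
MASK = (1 << N_BITS) - 1

# Hypothetical coefficients for illustration: d odd, d_0 = 5, d_j = (2j + 1) * 2^j.
D = 3
D_COEFFS = [5] + [(2 * j + 1) << j for j in range(1, N_BITS)]


def bit(x, j):
    """delta_j(x): the j-th bit of x."""
    return (x >> j) & 1


def bit_reverse(x):
    """Bit order reverse permutation pi on n-bit words."""
    return sum(bit(x, N_BITS - 1 - j) << j for j in range(N_BITS))


def F(x):
    """F(x) = d + sum_j d_j * delta_j(x), reduced mod 2^n."""
    return (D + sum(D_COEFFS[j] * bit(x, j) for j in range(N_BITS))) & MASK


def S(x):
    """S(x) = d + sum_j d_j * delta_{n-j-1}(x), reduced mod 2^n; no bit reversal needed."""
    return (D + sum(D_COEFFS[j] * bit(x, N_BITS - 1 - j) for j in range(N_BITS))) & MASK


same = all(S(x) == F(bit_reverse(x)) for x in range(1 << N_BITS))
print(same)
```

In this sketch $S$ reads the bits of $x$ from the top down, so the slow permutation is replaced by a reordering of the constants $d_j$.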

The following Proposition could be proved:

Proposition 9.2 ([4]). The function $F(x) = d + d_0 \cdot \delta_0(x) + d_1 \cdot \delta_1(x) + \cdots$ is compatible and ergodic if and only if $\|d\|_2 = 1$, $d_0 \equiv 1 \pmod 4$, and $\|d_j\|_2 = 2^{-j}$ for $j = 1, 2, \ldots$
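In concrete terms the conditions say that $d$ is odd, $d_0 \equiv 1 \pmod 4$, and $d_j = 2^j \cdot (\text{odd})$ for $j \ge 1$. The sketch below checks the single cycle property modulo $2^8$ for one admissible choice of coefficients and for one choice violating the condition on $d_0$; all numerical values and helper names are assumptions made for the example.

```python
def make_F(d, coeffs, n):
    """Return F(x) = d + sum_j coeffs[j] * delta_j(x), reduced mod 2^n."""
    mask = (1 << n) - 1

    def F(x):
        total = d
        for j, dj in enumerate(coeffs):
            total += dj * ((x >> j) & 1)
        return total & mask

    return F


def is_single_cycle(f, n):
    """True if the orbit of 0 under f first returns to 0 after exactly 2^n steps."""
    x, steps = f(0), 1
    while x != 0 and steps < (1 << n):
        x = f(x)
        steps += 1
    return x == 0 and steps == (1 << n)


n = 8
# Admissible: d odd, d_0 = 5 (1 mod 4), d_j = (2j + 1) * 2^j for j >= 1.
good = make_F(3, [5] + [(2 * j + 1) << j for j in range(1, n)], n)
# Not admissible: d_0 = 3 (3 mod 4), everything else as for x + 1.
bad = make_F(1, [3] + [1 << j for j in range(1, n)], n)

print(is_single_cycle(good, n), is_single_cycle(bad, n))
```

With $d = 1$ and $d_j = 2^j$ for all $j$ the function is simply $x \mapsto x + 1$, the most basic ergodic map; the second example breaks down already modulo 4, where it swaps 0 and 1.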

Now we just take clock output functions $H_i$ (cf. Figure 8) of the form $H_i(x) = c_{i,l} + F(x)$, where $c_{i,l}$ is the left-hand part of the word produced by the LFSR $L$. Thus, the circuit in Figure 9 is a special case of the circuit in Figure 8. We note, once again, that compared to the template, the real-life stream cipher ABC has some important differences; nevertheless, the above-mentioned ideas enable us to prove crucial cryptographic properties of the cipher, namely long period, uniform distribution and high linear complexity of the output sequence; see [2] for details.